Hossein Rajaby Faghihi

About Me

I am a Ph.D. candidate (soon to graduate) at Michigan State University, majoring in natural language processing. My research focuses on combining reasoning tools with deep neural networks while integrating background knowledge. It is most closely related to spatial-temporal reasoning, procedural reasoning, Large Language Models, Question-Answering, and information extraction. I have been a reviewer for top-tier NLP venues such as ACL, NAACL, and EMNLP, and for broader conferences such as IJCAI and AAAI. I have also been involved in organizing neuro-symbolic workshops such as CLeaR (Combining Learning and Reasoning). I obtained my M.Sc. at Sharif University of Technology on the integration of Machine Learning and Symbolic knowledge in the scope of IoT. My B.Sc. degree is from Amirkabir University of Technology.

Education

Ph.D. 2019-2023

Computer Science - NLP

Michigan State University

Thesis: Exploiting Semantic Structures Toward Procedural Reasoning

Focus: NLP, AI, Machine Learning, Neuro-Symbolic AI, LLM, Neural Logical Reasoning
MSc 2016-2018

Computer Engineering

Sharif University of Technology

Thesis: Hybrid Learning Approach Toward Situation Recognition and Handling
BSc 2012-2016

Computer Science

Amirkabir University of Technology

GPA: 3.94

Recent Experiences

08/2019 - Current
Graduate Research Assistant
Michigan State University
- Led a team of more than 12 researchers in designing and implementing a neuro-symbolic framework using PyTorch, leveraging Declarative Learning-Based Programming language to integrate domain knowledge into neural networks effectively.
- Developed models that enhance pretrained language models' comprehension of procedural texts, resulting in improved procedural reasoning capabilities.
- Created and curated challenging tasks and datasets to evaluate language models' spatial reasoning capabilities.
- Employed language models (RoBERTa, T5, GPT-X) and adapted them to solve complex reasoning tasks.
- Published more than 6 lead-authored research papers at major conferences, such as NAACL 2021, EMNLP 2021, ACL 2020, AAAI 2023, and EACL 2023.
05/2022 - 08/2022
Research Scientist Intern
Apple
- Improved the performance of existing models for Siri intent detection by 30% through various techniques, including addressing data imbalance, applying data clustering based on sentence representations, and leveraging language models.
- Contributed to developing novel data augmentation techniques to enhance the performance and robustness of Siri intent detection.
05/2021 - 08/2021
Research Scientist Intern
Dataminr
- Led the design and implementation of a knowledge-based clustering pipeline for timeline extraction and summarization of local crisis events on Twitter.
- Published a paper as the lead author at EMNLP 2022, introducing a novel timeline extraction method and benchmark.
08/2016 - 06/2019
CEO
Vestaak
- Founded a company specializing in training young talent in programming and machine learning and introducing them to companies in need of new engineers.
- Created a two-sided business plan: first, developing and maintaining projects for startups; second, introducing trained engineers to large companies.
11/2017 - 07/2019
CTO
Vanda Andisheh Amirkabir
- Led the technical team responsible for web application and mobile development.
- Created technical plans and identified the software requirements for implementing E-Education services.

Awards

Jackson Data & Impact Award 2022

Declarative Neuro-Symbolic Learning and Reasoning Framework
Best Poster 2022

AI Symposium

University of Michigan

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning
Best Poster 2022

Graduate Symposium

Michigan State University

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning
Best Poster 2021

AI Symposium

University of Michigan

DomiKnowS: A Library for Integration of Symbolic Domain Knowledge in Deep Learning

Research Summary

My research focuses on enhancing the reasoning capabilities of deep learning models, especially models for natural language understanding.

During my time at Michigan State University, I have published many research papers on reasoning and deep learning, covering topics including Procedural Reasoning, Spatial Reasoning, Question-Answering, Semantic Parsing, Neuro-Symbolic AI, Domain Knowledge Integration, and more.

In addition to my ongoing research at MSU, I have also been involved in application-focused research at both Apple and Dataminr. At Apple, I contributed to improving Siri's underlying natural language understanding. At Dataminr, I published a research paper introducing novel techniques and benchmarks for extracting coherent disaster stories from social media.

Interests

Knowledge Reasoning and Representation
Neuro-Symbolic AI
Integration of Domain Knowledge with deep neural modeling
Neural Logical Reasoning
Large Foundation Models (e.g., LLMs)
Intelligent Human Computer Interaction
Learning User Behaviour and Recommendation systems
Procedural and Spatial Reasoning in natural language
Multi-Modal reasoning
Artificial Intelligence
Machine Learning

Research Projects

Prompt 2 Model 2023

Generating Complex Neural Architectures from Natural Language Prompts
Research

The goal of this project is to facilitate the development of custom architectures by simply prompting Large Language Models. We explore a series of techniques for mapping prompts into declarative programming frameworks for designing custom neural architectures that can be aligned with a symbolic knowledge space.
Machine Learning AI LLM Prompt Engineering Dynamic Retrieval Neuro-Symbolic AI NLP
Procedural Reasoning 2020 - 2023

Understanding world models and their evolution in procedural text and multi-modal data.
Research

In this research project, which led to multiple publications in top-tier conferences, we explore the capability of deep learning models in understanding the evolution of events and actions in a process. This understanding includes, but is not limited to, identifying important actions, the effects of actions on the involved entities, and the goal of the process. Within this project, we mainly focus on improving the capability of language models through knowledge augmentation, connections to symbolic knowledge, incorporation of domain knowledge, and similar techniques.
Machine Learning AI Procedural Reasoning Reasoning Neuro-Symbolic AI NLP Vision
DomiKnowS 2019 - 2023

Declarative Learning-based Programming
Research

DomiKnowS is a library for integrating domain knowledge in deep learning architectures. Using this library, the data structure is expressed symbolically via graph declarations, and the logical constraints over outputs or latent variables can be seamlessly added to the deep models. The domain knowledge can be defined explicitly, improving the models' explainability and the performance and generalizability in the low-data regime. Several approaches for integrating symbolic and sub-symbolic models have been introduced; however, there is no library to facilitate the programming for such integration in a generic way while various underlying algorithms can be used. Our library aims to simplify programming for such integration in both training and inference phases while separating the knowledge representation from learning algorithms. We showcase various NLP benchmark tasks and beyond. The framework is publicly available on GitHub (https://github.com/HLR/DomiKnowS).
Research Machine Learning AI Ontology Domain Knowledge Knowledge Graph Neuro-Symbolic AI NLP Vision
Social Media Crisis Stories 2021

Automatic Extraction of disaster events from Twitter
Research

In this project, which led to a publication at EMNLP 2022, we propose a novel approach to extract relevant information on disaster events from Twitter. We go beyond topic detection and extract coherent stories at a local scale from tweets. We further provide a dataset for understanding the relevance and importance of the content and for summarizing the disaster stories extracted from Twitter.
Machine Learning AI LLM Social Media Clustering Neuro-Symbolic AI NLP
Spatial Reasoning 2021

Understanding spatial relations from natural language
Research

In this project, which led to a paper published at NAACL 2021, we explore the capabilities of SOTA models in understanding complex multi-hop spatial relationships described in natural language. We provide synthetic and human-generated datasets for multi-hop spatial reasoning, evaluate the capability of models to transfer their learning from the synthetic set to other datasets, and further analyze a series of baselines on their spatial understanding.
Machine Learning AI Spatial Reasoning Reasoning Neuro-Symbolic AI NLP Vision
M.Sc. Thesis 2017 - 2019

A Situation-Aware Framework for IoT Environments
Research

We propose a novel hybrid learning approach to gain situation awareness in smart environments by introducing a new situation identifier that combines an expert system and a machine learning approach. Traditionally, expert systems and machine learning approaches have been widely used independently to detect ongoing situations as the main functionality in smart environments in various domains. Expert systems lack the functionality to adapt the system to each user and are expensive to design based on each setting. On the other hand, machine learning approaches fail in the challenge of cold start and making explainable decisions. Using both of these approaches enables the system to use user’s feedback and capture environmental changes while exploiting the initial expert knowledge to solve the mentioned challenges. We use decision trees and situation templates as the core structure to interpret sensor data. To evaluate the proposed method, we generate a new human-annotated dataset simulating a smart environment. Our experiments show superior results compared with the initial expert system and the machine learning approach while preserving the initial expert system’s interpretability.
Research IoT Game Theory Machine Learning Ontology XML Hybrid Learning Neuro-Symbolic AI
Paper Crawler & Integrator 2018

Paper and professors Crawler & Integrator from university websites and paper libraries.
Project

The project was about finding papers related to professors from certain universities, integrating them with the professors' data, and storing the results in data warehouses. The project was implemented using Python, Scrapy, and Selenium.
Python Crawler Data Science
Automatic Resume Analyzer and Interview caller 2018

Built with Python and Telegram CLI.
Project

Analyzes received resumes and assigns a matching score to each one, then invites the top job seekers to an interview using Telegram CLI.
Python Crawler Data Science Telegram CLI
Social Media Topic Detector 2017

Detecting Telegram channel Topic from texts
Project

The project is a Telegram crawler that identifies the topic of Telegram channels. Channels are discovered by the robot itself from groups and other channels.
Python Machine Learning Data Science Telegram CLI
Enhanced Resume Analyzer 2017

Resume Analyzer from PDF with the help of Machine Learning Techniques.
Project Research

The project was about extracting information from PDF resume files into categories.
Python Crawler Data Science Machine Learning
Smart Apply Suggester 2018

Suggesting application destinations for master's and Ph.D. programs
Project Research

The project was built to help students estimate their chances when applying to universities. The process compared user data with previous data from universities, which contained a mapping between universities, and drew inferences from them.
Python Crawler Data Science Machine Learning Data Mining
Academic Partnership Recommender 2018

Data extraction from paper and writer dataset
Project Research

The first challenges were cleaning the data and extracting some historical data to detect noise points. We then made predictions on counts for professors and suggested partners based on previous papers.
Research Data mining
Semantic Web Survey and Improvement - NLP 2017

Domain Specific ontology for aspect based opinion mining
Research

In this paper, we discussed previous solutions for opinion mining and improved them using a domain-specific ontology. The ontology helped us in many cases, such as aspect analysis and handling similar words.
Research Ontology Opinion Mining NLP
Bachelor of Science Thesis 2016

Designing an E-Learning system including encouragement tactics and crowdsourced questions
Research Project

The mission was to design an E-Learning platform for students and teachers to collaborate on courses and classes. Teacher questions were crowdsourced, and the point-assignment process was also crowdsourced among students. Students were encouraged to continue solving problems through feedback.
Thesis Research Project Web Developing Laravel Crowd sourcing
Designing Google Plus (Mock) 2015

Designing and coding a social media platform like Google Plus
Project

The project was about coding a complete social media platform, including an administrator side. The stack was HTML, CSS, and JavaScript for front-end development and PHP and MySQL for back-end development.
Programming HTML CSS Javascript PHP Database Mysql Social Media
Database Design 2015

Designing social media database structure and administrator phase
Project

The project was about designing an efficient database structure for social media, including building queries for several common requests, designing complex queries, and creating trigger events.
Programming Mysql Postgres PHP Database Social Media
Compiler Fundamentals - Java Compiler 2014

Designing and Coding a Mini Java Compiler
Project Research

The project was about designing and coding a minimal Java compiler. The process covered the compiler's phases from lexical analysis to semantic analysis. The compiler itself was written in C++.
Programming C++ Compiler
Advanced Programming - Student Registration Service using C++ 2013

Designing and Coding a functional student, teacher and course registration system.
Project

The project was about building a course selection system for student registration. Each student had an access code to log in and then commands to list the courses, select a course, get the list of selected courses, remove lists, etc. Each teacher also had an access code and was able to register a new course and get the list of students registered for a course. The user interface was designed as a terminal command line.
Programming C++ System Design Object oriented Programming

Publications

The Role of Semantic Parsing in Understanding Procedural Text

Hossein Rajaby Faghihi, Parisa Kordjamshidi, Choh Man Teng, James Allen

Conference Paper EACL 2023

Abstract

In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser (TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.

GLUECons: A Generic Benchmark for Learning under Constraints

Hossein Rajaby Faghihi, Aliakbar Nafar, Chen Zheng, Roshanak Mirzaee, Yue Zhang, Andrzej Uszok, Alexander Wan, Tanawan Premsri, Dan Roth, Parisa Kordjamshidi

Conference Paper AAAI 2023

Abstract

Recent research has shown that integrating domain knowledge into deep learning architectures is effective: it helps reduce the amount of required data, improves the accuracy of the models' decisions, and improves the interpretability of models. However, the research community lacks a convened benchmark for systematically evaluating knowledge integration methods. In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision. In all cases, we model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints. We report the results of these models using a new set of extended evaluation criteria in addition to the task performances for a more in-depth analysis. This effort provides a framework for a more comprehensive and systematic comparison of constraint integration techniques and for identifying related research challenges. It will facilitate further research for alleviating some problems of state-of-the-art neural models.

CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization

Hossein Rajaby Faghihi, Bashar Alhafni, Ke Zhang, Shihao Ran, Joel Tetreault, Alejandro Jaimes

Conference Paper EMNLP 2022

Abstract

Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks to leverage large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches for those tasks. This paper presents CrisisLTLSum, the largest dataset of local crisis event timelines available to date. CrisisLTLSum contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built CrisisLTLSum using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines compared to the human performance on both tasks. Our dataset, code, and models are publicly available (https://github.com/CrisisLTLSum/CrisisTimelines).

DomiKnowS: A Library for Integration of Symbolic Domain Knowledge in Deep Learning

Hossein Rajaby Faghihi, Quan Guo, Andrzej Uszok, Aliakbar Nafar, Parisa Kordjamshidi

Conference Paper EMNLP 2021

Abstract

We demonstrate a library for the integration of domain knowledge in deep learning architectures. Using this library, the structure of the data is expressed symbolically via graph declarations and the logical constraints over outputs or latent variables can be seamlessly added to the deep models. The domain knowledge can be defined explicitly, which improves the explainability of the models in addition to their performance and generalizability in the low-data regime. Several approaches for such integration of symbolic and sub-symbolic models have been introduced; however, there is no library to facilitate the programming for such integration in a generic way while various underlying algorithms can be used. Our library aims to simplify programming for such integration in both training and inference phases while separating the knowledge representation from learning algorithms. We showcase various NLP benchmark tasks and beyond. The framework is publicly available on GitHub (https://github.com/HLR/DomiKnowS).

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

Roshanak Mirzaee, Hossein Rajaby Faghihi, Qiang Ning, Parisa Kordjamshidi

Conference Paper NAACL 2021

Abstract

This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LM). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically generated data significantly improves LMs’ capability on spatial understanding, which in turn helps to better solve two external datasets, bAbI, and boolQ. We hope that this work can foster investigations into more sophisticated models for spatial reasoning over text.

Time-stamped language model: Teaching language models to understand the flow of events

Hossein Rajaby Faghihi, Parisa Kordjamshidi

Conference Paper NAACL 2021

Abstract

Tracking entities throughout a procedure described in a text is challenging due to the dynamic nature of the world described in the process. Firstly, we propose to formulate this task as a question answering problem. This enables us to use transformer-based language models pre-trained on other QA benchmarks by adapting them to procedural text understanding. Secondly, since transformer-based language models cannot encode the flow of events by themselves, we propose a Time-Stamped Language Model (TSLM) to encode event information in the LM architecture by introducing the timestamp encoding. Our model evaluated on the Propara dataset shows improvements on the published state-of-the-art results with a 3.1% increase in F1 score. Moreover, our model yields better results on the location prediction task on the NPN-Cooking dataset. This result indicates that our approach is effective for procedural text understanding in general.

Latent alignment of procedural concepts in multimodal recipes

Hossein Rajaby Faghihi, Roshanak Mirzaee, Sudarshan Paliwal, Parisa Kordjamshidi

Workshop Paper ALVR (ACL) 2020

Abstract

We propose a novel alignment mechanism to deal with procedural reasoning on a newly released multimodal QA dataset, named RecipeQA. Our model is solving the textual cloze task which is a reading comprehension on a recipe containing images and instructions. We exploit the power of attention networks, cross-modal representations, and a latent alignment space between instructions and candidate answers to solve the problem. We introduce constrained max-pooling which refines the max pooling operation on the alignment matrix to impose disjoint constraints among the outputs of the model. Our evaluation result indicates a 19% improvement over the baselines.

Inference-masked loss for deep structured output learning

Quan Guo, Hossein Rajaby Faghihi, Yue Zhang, Andrzej Uszok, Parisa Kordjamshidi

Conference Paper IJCAI 2020

Abstract

Structured learning algorithms usually involve an inference phase that selects the best global output variables assignments based on the local scores of all possible assignments. We extend deep neural networks with structured learning to combine the power of learning representations and leveraging the use of domain knowledge in the form of output constraints during training. Introducing a nondifferentiable inference module to gradient-based training is a critical challenge. Compared to using conventional loss functions that penalize every local error independently, we propose an inference-masked loss that takes into account the effect of inference and does not penalize the local errors that can be corrected by the inference. We empirically show the inference-masked loss combined with the negative log-likelihood loss improves the performance on different tasks, namely entity relation recognition on CoNLL04 and ACE2005 corpora, and spatial role labeling on CLEF 2017 mSpRL dataset. We show the proposed approach helps to achieve better generalizability, particularly in the low-data regime.

Hybrid Learning Approach Toward Situation Recognition and Handling

Hossein Rajaby Faghihi, MohammadAmin Fazli, Jafar Habibi

Journal Paper Computer Journal

Abstract

We propose a novel hybrid learning approach to gain situation awareness in smart environments by introducing a new situation identifier that combines an expert system and a machine learning approach. Traditionally, expert systems and machine learning approaches have been widely used independently to detect ongoing situations as the main functionality in smart environments in various domains. Expert systems lack the functionality to adapt the system to each user and are expensive to design based on each setting. On the other hand, machine learning approaches fail in the challenge of cold start and making explainable decisions. Using both of these approaches enables the system to use user’s feedback and capture environmental changes while exploiting the initial expert knowledge to solve the mentioned challenges. We use decision trees and situation templates as the core structure to interpret sensor data. To evaluate the proposed method, we generate a new human-annotated dataset simulating a smart environment. Our experiments show superior results compared with the initial expert system and the machine learning approach while preserving the initial expert system’s interpretability.

Skills

Expert Advanced Intermediate Beginner

Specialities:
Neuro-Symbolic AI Integrating Domain Knowledge and Deep Learning Procedural Reasoning Spatial-Temporal Reasoning Large Language Models Neural Logical Reasoning Generative AI Foundation Models Prolog/Problog
AI/Machine Learning:
Python Programming language Machine Learning Artificial Intelligence Deep Learning Sklearn library PyTorch HuggingFace library Transformers NLP Vision Multi-Modal Learning Clustering Supervised Learning Unsupervised Learning Ontology Data Mining HCI R programming language
Web Back End Developing:
PHP Laravel Django Django Rest Framework Spring Kotlin Adonis.js Async ServerSide Programming Linux
Database Design and Concepts:
MySQL PostgreSQL SQLite SQL Server MongoDB
Web Front End Design and Develop:
HTML CSS JavaScript jQuery React Redux Sass Less Gulp Bootstrap Semantic UI Material UI

Contact & Social

phone: +1 - 517 - 303 21 72
h.faghihi15 at gmail.com
rajabyfa at msu.edu
in/hosseinfaghihi/
https://scholar.google.com/citations?user=S-GLfiIAAAAJ&hl=en