
XAI Unit


We study how to make NLP models explainable, robust, and rational.

Members

  • Chau Nguyen (Nguyen Lab) (Unit leader)
  • Kenshiro Tanaka
  • Yoshihiro Sakai
  • Naoya Inoue
  • Yufeng Zhao
  • Tien Dang Huu
  • Akira Ishii
  • Mariko Kato

Meeting Log

2024/4-2025/3

Date | Presenter | Topic
2025/03/19 | Kenshiro Tanaka | Paper reading: Presentation of 3 papers from NLP2025
2024/12/11 | Naoya Inoue | Paper reading: The Super Weight in Large Language Models. https://arxiv.org/abs/2411.07191
2024/11/27 | Tien Dang Huu | Progress report: Unlearning
2024/11/20 | Naoya Inoue | Paper reading: Bürger et al. Truth is Universal: Robust Detection of Lies in LLMs. https://arxiv.org/abs/2407.12831
2024/07/03 | Chau Nguyen | Paper reading: Langedijk et al. DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers. NAACL2024.
2024/06/26 | Naoya Inoue | Paper reading: Huh et al. The Platonic Representation Hypothesis. ICML2024.
2024/06/19 | Mariko Kato | Paper reading: Bansal et al. Revisiting Model Stitching to Compare Neural Representations. NeurIPS2021.
2024/06/12 | Kenshiro Tanaka | Progress report: Metacognitive reasoning
2024/05/29 | Tien Dang Huu | Progress report: Machine Unlearning
2024/05/22 | Naoya Inoue | Paper discussion: Hernandez et al. Linearity of Relation Decoding in Transformer Language Models. ICLR2024.
2024/05/15 | Yoshihiro Sakai | Paper discussion: Kossen et al. In-Context Learning Learns Label Relationships but Is Not Conventional Learning. ICLR2024.
2024/05/08 | Yufeng Zhao | Paper reading: Du et al. Understanding Emergent Abilities of Language Models from the Loss Perspective. arXiv2024.
2024/04/24 | Tien Dang Huu | Paper reading: Li et al. The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning. arXiv2024. + Progress report
2024/04/17 | Chau Nguyen | Paper reading: Ladhak et al. When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization. EACL2023.
2024/04/10 | Kenshiro Tanaka | Paper reading: Li et al. Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. NeurIPS2023.
2024/04/03 | Naoya Inoue | Paper reading: Dai et al. Knowledge Neurons in Pretrained Transformers. ACL2022.

2023/4-2024/3

Date | Presenter | Topic
2024/03/27 | Kenshiro Tanaka | Paper reading: NLP2024 papers (Part 2)
2024/03/20 | Yufeng Zhao | Paper reading: NLP2024 papers (Part 1)
2024/03/01 | Daichi Haraguchi | Paper reading: Yamato et al. Evolution of metamemory based on self-reference to own memory in artificial neural network with neuromodulation. Nature2022.
2024/02/28 | Kenshiro Tanaka | Progress report: Mid-term presentation
2024/02/21 | Tien Dang Huu | Progress report: ACL submission
2024/01/31 | Chau Nguyen | Paper reading: Lanham et al. Measuring Faithfulness in Chain-of-Thought Reasoning. arXiv2023.
2024/01/24 | Naoya Inoue | Progress report: Back-off LMKB
2024/01/10 | Yufeng Zhao | Paper reading: Wang et al. Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning. EMNLP2023.
2023/12/20 | Daichi Haraguchi | Paper reading: Hagström et al. The Effect of Scaling, Retrieval Augmentation and Form on the Factual Consistency of Language Models. EMNLP2023.
2023/12/13 | Tien Dang Huu | Progress report: Machine Unlearning
2023/12/06 | Chau Nguyen | Paper reading: Lampinen et al. Can language models learn from explanations in context? EMNLP2022.
2023/11/08 | Yufeng Zhao | Survey report: Model Interpolation & Vectorial Model Editing
2023/11/01 | Yoshihiro Sakai | Paper reading: Wu et al. Do PLMs Know and Understand Ontological Knowledge? ACL2023.
2023/10/26 | Daichi Haraguchi | Paper reading: Meng et al. Locating and Editing Factual Associations in GPT. NeurIPS2022.
2023/10/12 | Tien Dang Huu | Paper reading: Nguyen-Duc et al. Class based Influence Functions for Error Detection. ACL2023.
2023/10/05 | Chau Nguyen | Paper reading: Atanasova et al. Faithfulness Tests for Natural Language Explanations. ACL2023.
2023/09/21 | Kenshiro Tanaka | Paper reading: Besta et al. Graph of Thoughts: Solving Elaborate Problems with Large Language Models. arXiv2023.
2023/09/14 | Naoya Inoue | Paper reading: Hong et al. Faithful Question Answering with Monte-Carlo Planning. ACL2023.
2023/09/07 | Daichi Haraguchi | Paper reading: Sakai et al. Investigating the Reasoning Ability of Pretrained Language Models on Unseen Knowledge (in Japanese). 2023-NL-257.
2023/08/31 | Chau Nguyen | Paper reading: Hu et al. LoRA: Low-Rank Adaptation of Large Language Models. arXiv2021.
2023/08/24 | Daichi Haraguchi | Paper reading: Wei et al. Simple synthetic data reduces sycophancy in large language models. arXiv2023.
2023/08/18 | Linh Hoai Luu | Practice talk: Linh's MS thesis defense
2023/08/04 | Naoya Inoue | Paper reading: Hao et al. Reasoning with Language Model is Planning with World Model. arXiv2023.
2023/07/21 | Naoya Inoue | Paper reading: Wong et al. From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought. arXiv2023.
2023/07/06 | Chau Nguyen | Paper reading: Stacey et al. Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models. EMNLP2022.
2023/06/30 | Daichi Haraguchi | Paper reading: Dziri et al. Faith and Fate: Limits of Transformers on Compositionality. arXiv2023.
2023/06/23 | Kenshiro Tanaka | Paper reading: Jin et al. Evidence of Meaning in Language Models Trained on Programs. arXiv2023.
2023/06/09 | Naoya Inoue | Presentation: Large Language Models as Real-world Planner
2023/06/02 | Chau Nguyen | Paper reading: Yao et al. Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv2023.
2023/05/26 | Daichi Haraguchi | Paper reading: Dai et al. Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers. ACL2023.
2023/05/19 | Linh Hoai Luu | Paper reading: Li et al. Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates. EMNLP2022.
2023/05/12 | Naoya Inoue | Paper reading: Zhou et al. Context-faithful Prompting for Large Language Models. arXiv2023.
2023/04/28 | Daichi Haraguchi | Paper reading: Madaan et al. Self-Refine: Iterative Refinement with Self-Feedback. arXiv2023.
2023/04/21 | Chau Nguyen | Paper reading: Xu et al. Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Science Question Answering. ACL2021.
2023/04/10 | Linh Hoai Luu | Paper reading: Liu et al. WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation. EMNLP2022.

©RebelsNLU at JAIST