
XAI Unit


We study how to make NLP models explainable, robust, and rational.

Members

  • Chau Nguyen (Nguyen Lab) (Unit leader)
  • Kenshiro Tanaka
  • Yoshihiro Sakai
  • Naoya Inoue
  • Yufeng Zhao
  • Tien Dang Huu
  • Akira Ishii
  • Mariko Kato

Meeting Log

2024/4-2025/3

Date | Presenter | Topic
2025/03/19 | Kenshiro Tanaka | Paper reading: Presentation of 3 papers from NLP2025
2024/12/11 | Naoya Inoue | Paper reading: The Super Weight in Large Language Models. https://arxiv.org/abs/2411.07191
2024/11/27 | Tien Dang Huu | Progress report: Unlearning
2024/11/20 | Naoya Inoue | Paper reading: Bürger et al. Truth is Universal: Robust Detection of Lies in LLMs. https://arxiv.org/abs/2407.12831
2024/07/03 | Chau Nguyen | Paper reading: Langedijk et al. DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers. NAACL2024.
2024/06/26 | Naoya Inoue | Paper reading: Huh et al. The Platonic Representation Hypothesis. ICML2024.
2024/06/19 | Mariko Kato | Paper reading: Bansal et al. Revisiting Model Stitching to Compare Neural Representations. NeurIPS2021.
2024/06/12 | Kenshiro Tanaka | Progress report: Metacognitive reasoning
2024/05/29 | Tien Dang Huu | Progress report: Machine Unlearning
2024/05/22 | Naoya Inoue | Paper discussion: Hernandez et al. Linearity of Relation Decoding in Transformer Language Models. ICLR2024.
2024/05/15 | Yoshihiro Sakai | Paper discussion: Kossen et al. In-Context Learning Learns Label Relationships but Is Not Conventional Learning. ICLR2024.
2024/05/08 | Yufeng Zhao | Paper reading: Du et al. Understanding Emergent Abilities of Language Models from the Loss Perspective. arXiv2024.
2024/04/24 | Tien Dang Huu | Paper reading: Li et al. The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning. arXiv2024. + Progress report
2024/04/17 | Chau Nguyen | Paper reading: Ladhak et al. When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization. EACL2023.
2024/04/10 | Kenshiro Tanaka | Paper reading: Li et al. Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. NeurIPS2023.
2024/04/03 | Naoya Inoue | Paper reading: Dai et al. Knowledge Neurons in Pretrained Transformers. ACL2022.

2023/4-2024/3

Date | Presenter | Topic
2024/03/27 | Kenshiro Tanaka | Paper reading: NLP2024 papers (Part 2)
2024/03/20 | Yufeng Zhao | Paper reading: NLP2024 papers (Part 1)
2024/03/01 | Daichi Haraguchi | Paper reading: Yamato et al. Evolution of metamemory based on self-reference to own memory in artificial neural network with neuromodulation. Nature2022.
2024/02/28 | Kenshiro Tanaka | Progress report: Mid-term presentation
2024/02/21 | Tien Dang Huu | Progress report: ACL submission
2024/01/31 | Chau Nguyen | Paper reading: Lanham et al. Measuring Faithfulness in Chain-of-Thought Reasoning. arXiv2023.
2024/01/24 | Naoya Inoue | Progress report: Back-off LMKB
2024/01/10 | Yufeng Zhao | Paper reading: Wang et al. Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning. EMNLP2023.
2023/12/20 | Daichi Haraguchi | Paper reading: Hagström et al. The Effect of Scaling, Retrieval Augmentation and Form on the Factual Consistency of Language Models. EMNLP2023.
2023/12/13 | Tien Dang Huu | Progress report: Machine Unlearning
2023/12/06 | Chau Nguyen | Paper reading: Lampinen et al. Can language models learn from explanations in context? EMNLP2022.
2023/11/08 | Yufeng Zhao | Survey report: Model Interpolation & Vectorial Model Editing
2023/11/01 | Yoshihiro Sakai | Paper reading: Wu et al. Do PLMs Know and Understand Ontological Knowledge? ACL2023.
2023/10/26 | Daichi Haraguchi | Paper reading: Meng et al. Locating and Editing Factual Associations in GPT. NeurIPS2022.
2023/10/12 | Tien Dang Huu | Paper reading: Nguyen-Duc et al. Class based Influence Functions for Error Detection. ACL2023.
2023/10/05 | Chau Nguyen | Paper reading: Atanasova et al. Faithfulness Tests for Natural Language Explanations. ACL2023.
2023/09/21 | Kenshiro Tanaka | Paper reading: Besta et al. Graph of Thoughts: Solving Elaborate Problems with Large Language Models. arXiv2023.
2023/09/14 | Naoya Inoue | Paper reading: Hong et al. Faithful Question Answering with Monte-Carlo Planning. ACL2023.
2023/09/07 | Daichi Haraguchi | Paper reading: Sakai et al. Investigating the Reasoning Ability of Pretrained Language Models on Unseen Knowledge (in Japanese). 2023-NL-257.
2023/08/31 | Chau Nguyen | Paper reading: Hu et al. LoRA: Low-Rank Adaptation of Large Language Models. arXiv2021.
2023/08/24 | Daichi Haraguchi | Paper reading: Wei et al. Simple synthetic data reduces sycophancy in large language models. arXiv2023.
2023/08/18 | Linh Hoai Luu | Practice talk: Linh's MS thesis defense
2023/08/04 | Naoya Inoue | Paper reading: Hao et al. Reasoning with Language Model is Planning with World Model. arXiv2023.
2023/07/21 | Naoya Inoue | Paper reading: Wong et al. From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought. arXiv2023.
2023/07/06 | Chau Nguyen | Paper reading: Stacey et al. Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models. EMNLP2022.
2023/06/30 | Daichi Haraguchi | Paper reading: Dziri et al. Faith and Fate: Limits of Transformers on Compositionality. arXiv2023.
2023/06/23 | Kenshiro Tanaka | Paper reading: Jin et al. Evidence of Meaning in Language Models Trained on Programs. arXiv2023.
2023/06/09 | Naoya Inoue | Presentation: Large Language Models as Real-world Planner
2023/06/02 | Chau Nguyen | Paper reading: Yao et al. Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv2023.
2023/05/26 | Daichi Haraguchi | Paper reading: Dai et al. Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers. ACL2023.
2023/05/19 | Linh Hoai Luu | Paper reading: Li et al. Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates. EMNLP2022.
2023/05/12 | Naoya Inoue | Paper reading: Zhou et al. Context-faithful Prompting for Large Language Models. arXiv2023.
2023/04/28 | Daichi Haraguchi | Paper reading: Madaan et al. Self-Refine: Iterative Refinement with Self-Feedback. arXiv2023.
2023/04/21 | Chau Nguyen | Paper reading: Xu et al. Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Science Question Answering. ACL2021.
2023/04/10 | Linh Hoai Luu | Paper reading: Liu et al. WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation. EMNLP2022.

©RebelsNLU at JAIST