Interpretable sentiment-aware transformer-based model for individual log anomaly detection in distributed systems using word-level explanations

Abstract

Monitoring the health and security of large-scale distributed systems relies heavily on log analysis. However, the sheer volume and structural variability of modern log messages challenge traditional anomaly detection methods, which often depend on rigid, predefined parsers. To address this, we developed a parser-free, transformer-based framework that evaluates the semantic polarity of individual log entries to detect anomalies, coupled with a faithful word-level explainability layer.

Our approach leverages the Bidirectional Encoder Representations from Transformers with In-Task Pre-Training and Fine-Tuning (BERT-ITPT-FIT) architecture.

A fundamental challenge in applying Explainable AI (XAI) to Transformer-based text models is the tokenization process. Tokenizers often split domain-specific technical terms into sub-tokens, so attributions computed per token no longer align with the words a human reader sees.

Our core conceptual contribution is the development of a post-hoc reconstruction procedure using SHapley Additive exPlanations (SHAP). This procedure maps sub-token SHAP attributions back into semantically coherent, word-level explanations. The practical value of this framework extends beyond providing system administrators with immediate visual context for a flagged log. By assigning interpretable, word-level importance scores to each prediction, the model moves beyond binary anomaly indicators.
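The idea of mapping sub-token attributions back to words can be illustrated with a minimal sketch. It is not the paper's procedure, which the abstract does not spell out. It assumes a WordPiece-style tokenizer that marks continuation pieces with a `##` prefix, and it assumes that summing the pieces is an acceptable way to merge them (a natural choice because SHAP values are additive). The function name, tokens, and attribution values are hypothetical.

```python
from typing import List, Tuple


def merge_subtoken_attributions(
    tokens: List[str],
    values: List[float],
    prefix: str = "##",  # assumption: WordPiece-style continuation marker
) -> List[Tuple[str, float]]:
    """Merge per-sub-token attributions into per-word attributions.

    A token starting with `prefix` is glued onto the previous word and its
    attribution is added to that word's score. Summation keeps the total
    attribution unchanged, so the word-level scores still add up to the
    same quantity as the sub-token scores.
    """
    if len(tokens) != len(values):
        raise ValueError("tokens and values must have the same length")

    words: List[Tuple[str, float]] = []
    for token, value in zip(tokens, values):
        if token.startswith(prefix) and words:
            word, score = words[-1]
            words[-1] = (word + token[len(prefix):], score + value)
        else:
            words.append((token, value))
    return words


# Hypothetical tokenization of "connection timeout on datanode"
# with made-up attribution values.
example_tokens = ["conn", "##ection", "time", "##out", "on", "data", "##node"]
example_values = [0.30, 0.10, 0.25, 0.05, -0.02, 0.01, 0.03]

for word, score in merge_subtoken_attributions(example_tokens, example_values):
    print(f"{word:12s} {score:+.2f}")
```

This sketch only handles the continuation-prefix case. Log messages also contain paths, identifiers, and punctuation that a tokenizer may split without any marker, and a real reconstruction step would need extra rules for those.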

This methodology quantifies the anomaly-related evidence contained within individual log messages. This quantification serves as a foundational intermediate layer for downstream decision-support systems, allowing operational monitoring frameworks to aggregate severity scores and associate them with higher-level entities, such as specific subsystems or infrastructure components.
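As a rough illustration of how such per-message scores could feed a downstream layer, the sketch below groups scores by component and summarizes them. Everything here is an assumption made for the example: the function name, the `(component, score)` record shape, the component names, and the choice of count, mean, and maximum as summary statistics. The abstract does not prescribe a particular aggregation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate_severity(
    records: Iterable[Tuple[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Summarize per-message anomaly scores for each component.

    `records` is an iterable of (component, score) pairs, where `score`
    is a hypothetical per-message anomaly score produced upstream.
    """
    by_component: Dict[str, List[float]] = defaultdict(list)
    for component, score in records:
        by_component[component].append(score)

    return {
        component: {
            "count": len(scores),
            "mean": sum(scores) / len(scores),
            "max": max(scores),
        }
        for component, scores in by_component.items()
    }


# Hypothetical scored log messages.
scored_logs = [
    ("storage", 0.9),
    ("storage", 0.7),
    ("network", 0.1),
    ("scheduler", 0.4),
    ("network", 0.3),
]

summary = aggregate_severity(scored_logs)
# Rank components by their worst-scoring message.
for component in sorted(summary, key=lambda c: summary[c]["max"], reverse=True):
    print(component, summary[component])
```

A monitoring framework could use a summary like this to rank subsystems for operator attention, though the appropriate statistic depends on the deployment.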

Publication
Scientific Reports
Andrés Catlán
Ph.D. Student, Industrial Engineering and Operations Research
Rodrigo A. Carrasco
Associate Professor & Director of Data and Computing
Gonzalo Ruz
Universidad Adolfo Ibáñez
