PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY

 

e-ISSN 2231-8526
ISSN 0128-7680


Differences Between Research Log Datasets and Development Field Logs and the Creation of the Complexity Evaluation Index

Hironori Uchida, Keitaro Tominaga, Hideki Itai, Yujie Li and Yoshihisa Nakatoh

Pertanika Journal of Science & Technology, Volume 33, Issue S4, December 2025

DOI: https://doi.org/10.47836/pjst.33.S4.09

Keywords: Anomaly detection, complexity, evaluation index, log generator, system log

Published on: 2025-06-10

In industry, logs are widely used in the management and maintenance of software systems to ensure reliability and availability. In research, deep learning methods such as CNNs, LSTMs, and Transformers have been reported to achieve high accuracy in anomaly detection. However, these methods remain difficult to adopt in development fields. One reason is that research relies on a limited set of datasets, so the methods have not been comprehensively evaluated for general applicability. To address this, we prepared metrics for assessing the complexity of log datasets, as a prerequisite for building a log generator for research use. We then compared the complexity of datasets from the research and industrial domains. Our evaluation of log sequence complexity, based on frequency of occurrence and the Gini coefficient, showed that industrial logs are more complex on every metric. This underscores the need for research datasets that more closely resemble industrial logs. Our findings suggest that a clear measure of dataset complexity can be obtained by converting logs into templates, grouping the templates into sequences of size 10, and evaluating the result with the Gini coefficient or kurtosis. Future work will develop a generator that produces logs close to those found in development environments, using these metrics as target values.
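The pipeline outlined in the abstract (templates, then sequences of size 10, then Gini coefficient or kurtosis) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how sequences are cut or which values the statistics are computed over, so the sketch assumes sliding windows with a stride of one, statistics taken over the occurrence counts of each distinct sequence, and population excess kurtosis. It also assumes that log lines have already been mapped to integer template IDs by some log parser. All function names are hypothetical.

```python
from collections import Counter


def sliding_windows(template_ids, size=10):
    """Return every contiguous window of `size` template IDs as a tuple.

    Assumption: stride-1 sliding windows; the abstract only states the size.
    """
    return [
        tuple(template_ids[i:i + size])
        for i in range(len(template_ids) - size + 1)
    ]


def gini(counts):
    """Gini coefficient of non-negative counts (0.0 means perfectly even)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted form with 1-based ranks over ascending values.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def excess_kurtosis(counts):
    """Population excess kurtosis (fourth standardized moment minus 3)."""
    n = len(counts)
    if n == 0:
        return 0.0
    mean = sum(counts) / n
    m2 = sum((x - mean) ** 2 for x in counts) / n
    if m2 == 0:
        return 0.0  # all counts identical; kurtosis is undefined, report 0.0
    m4 = sum((x - mean) ** 4 for x in counts) / n
    return m4 / (m2 ** 2) - 3.0


def sequence_complexity(template_ids, size=10):
    """Summarize how occurrence counts are spread over distinct sequences."""
    freq = Counter(sliding_windows(template_ids, size))
    counts = list(freq.values())
    return {
        "distinct_sequences": len(counts),
        "gini": gini(counts),
        "kurtosis": excess_kurtosis(counts),
    }


if __name__ == "__main__":
    # Hypothetical toy input: a log that strictly alternates two templates.
    print(sequence_complexity([0, 1] * 10))
```

Under these assumptions, a strictly periodic log yields very few distinct sequences with nearly equal counts, so its Gini coefficient is close to zero. How these values map onto "more complex" follows the paper's own definitions, which the abstract does not spell out.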

Article ID

JST(S)-0694-2025
