Artificial Intelligence in Tax Law: Case Classification and Topic Modelling Using Large Language Models

Faisal Labib Zulfiqar; Ayu Rosalia

doi:10.53697/jkomitek.v5i1.2815

Authors

Faisal Labib Zulfiqar, Ministry of Finance, Indonesia
Ayu Rosalia, Universitas Mercubuana, Indonesia

DOI:

https://doi.org/10.53697/jkomitek.v5i1.2815

Keywords:

Tax Dispute Analytics, Large Language Models (LLM), Case Complexity Classification, Topic Modelling, Judicial Decision Support

Abstract

This study explored the use of the GPT API to automate case complexity classification and topic modelling in Indonesian Tax Court decisions. Using a dataset of 5,000 anonymized tax dispute summaries, we designed a prompt-based classification pipeline supported by expert-labelled benchmarks. The case complexity classification task categorized cases into low and high complexity, and the GPT-based model achieved 87% precision. This indicates the model’s practical ability to simulate legal judgment in triaging case difficulty. In parallel, topic modelling was performed to identify key dispute themes across grouped cases. The three most frequently recurring themes were: (1) input VAT correction errors in VAT disputes (26.1%), (2) net income adjustments in income tax cases (9.1%), and (3) customs valuation issues in import transactions (17.4%). These model-derived clusters aligned closely with expert taxonomies and provided useful summaries of dispute patterns over time. The methodology was built using Python, Google Colab, and the OpenAI GPT API. By structuring Indonesia’s growing corpus of tax litigation into actionable categories, this approach strengthens the country’s digital justice transformation, enabling better resource allocation and faster dispute resolution.
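The abstract does not reproduce the prompts or the evaluation code, so the following is only a minimal Python sketch of what a prompt-based complexity classifier with a precision check could look like. Every name in it (`build_prompt`, `parse_label`, `classify`, `precision`, `theme_shares`, `stub_complete`) is hypothetical, the prompt wording is invented for illustration, and treating "high" as the positive class for precision is an assumption. The model call is passed in as a plain function so the sketch runs offline; in practice that function would wrap a request to the GPT API.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

LABELS = ("low", "high")  # the two complexity classes named in the abstract


def build_prompt(summary: str) -> str:
    """Compose a zero-shot instruction asking for exactly one label.

    The wording is illustrative, not the prompt used in the study.
    """
    return (
        "You are assisting with triage of tax dispute cases.\n"
        "Classify the complexity of the case summary below as 'low' or 'high'.\n"
        "Answer with a single word.\n\n"
        f"Case summary: {summary}"
    )


def parse_label(reply: str) -> str:
    """Map a free-text model reply onto one of the two labels."""
    text = reply.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    raise ValueError(f"unrecognised label in reply: {reply!r}")


def classify(summaries: Sequence[str], complete: Callable[[str], str]) -> List[str]:
    """Classify each summary using `complete`, any prompt-to-reply function."""
    return [parse_label(complete(build_prompt(s))) for s in summaries]


def precision(predicted: Sequence[str], expected: Sequence[str],
              positive: str = "high") -> float:
    """Precision = TP / (TP + FP) for the class treated as positive (assumed)."""
    pairs = list(zip(predicted, expected))
    tp = sum(p == positive and e == positive for p, e in pairs)
    fp = sum(p == positive and e != positive for p, e in pairs)
    return tp / (tp + fp) if (tp + fp) else 0.0


def theme_shares(themes: Sequence[str]) -> Dict[str, float]:
    """Percentage share of each assigned theme label across all cases."""
    counts = Counter(themes)
    total = sum(counts.values())
    return {theme: 100 * n / total for theme, n in counts.items()}


def stub_complete(prompt: str) -> str:
    """Offline stand-in for a model call; the keyword rule is purely illustrative."""
    return "High" if "transfer pricing" in prompt.lower() else "Low"


summaries = [
    "Input VAT credit correction on a single invoice.",
    "Transfer pricing adjustment across three related entities.",
]
predicted = classify(summaries, stub_complete)
```

Keeping the model call behind a single function argument means the same evaluation code can be run against expert labels with either a stub or a live client, which makes the precision figure straightforward to recompute.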
