Artificial Intelligence in Tax Law: Case Classification and Topic Modelling Using Large Language Models
DOI:
https://doi.org/10.53697/jkomitek.v5i1.2815
Keywords:
Tax Dispute Analytics, Large Language Models (LLM), Case Complexity Classification, Topic Modelling, Judicial Decision Support
Abstract
This study explored the application of the OpenAI GPT API to automate case complexity classification and topic modelling for Indonesian Tax Court decisions. Using a dataset of 5,000 anonymized tax dispute summaries, we designed a prompt-based classification pipeline and evaluated it against expert-labelled benchmarks. The case complexity classification task sorted cases into low and high complexity, and the GPT-based model achieved 87% precision. This indicates that the model can, in practice, approximate legal judgment when triaging case difficulty. In parallel, topic modelling was performed to identify key dispute themes across grouped cases. The three most frequently recurring themes were: (1) input VAT correction errors in VAT disputes (26.1%), (2) net income adjustments in income tax cases (9.1%), and (3) customs valuation issues in import transactions (17.4%). These model-derived clusters aligned closely with expert taxonomies and provided useful summaries of dispute patterns over time. The methodology was built using Python, Google Colab, and the OpenAI GPT API. By structuring Indonesia’s growing corpus of tax litigation into actionable categories, this approach supports the country’s digital justice transformation, enabling better resource allocation and faster dispute resolution.
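The abstract describes a prompt-based classifier scored against expert labels. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the study's actual pipeline: the prompt wording, the `ask` callable (a stand-in for a call to the OpenAI GPT API), the toy summaries, and the choice of `high` as the positive class for precision are all hypothetical.

```python
from typing import Callable, Iterable, List

LABELS = ("low", "high")


def build_prompt(summary: str) -> str:
    """Assemble a hypothetical classification prompt for one case summary."""
    return (
        "You are assisting with triage of tax court disputes.\n"
        "Classify the complexity of the following case summary as 'low' or 'high'.\n"
        "Answer with a single word.\n\n"
        f"Summary: {summary}"
    )


def parse_label(reply: str) -> str:
    """Map a free-text model reply onto one of the two labels."""
    text = reply.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    raise ValueError(f"unrecognised reply: {reply!r}")


def classify(summaries: Iterable[str], ask: Callable[[str], str]) -> List[str]:
    """Classify each summary; `ask` sends a prompt and returns the reply text."""
    return [parse_label(ask(build_prompt(s))) for s in summaries]


def precision(predicted: List[str], expected: List[str], positive: str = "high") -> float:
    """Precision = true positives / all cases predicted as positive."""
    true_pos = sum(p == positive and e == positive for p, e in zip(predicted, expected))
    predicted_pos = sum(p == positive for p in predicted)
    return true_pos / predicted_pos if predicted_pos else 0.0


def fake_model(prompt: str) -> str:
    """Assumed stand-in for the real API call, so the sketch runs offline."""
    return "High" if "transfer pricing" in prompt else "Low"


# Toy data for illustration only; not drawn from the study's dataset.
summaries = [
    "Dispute over transfer pricing adjustments across three entities.",
    "Late filing penalty on a single VAT return.",
    "Missed deadline for transfer pricing documentation.",
]
expert = ["high", "low", "low"]

predicted = classify(summaries, fake_model)
print(predicted, precision(predicted, expert))
```

In a real run, `fake_model` would be replaced by a function that sends the prompt to the model and returns its text reply. Keeping the model call behind a plain callable lets the parsing and scoring logic be checked without network access.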
License
Copyright (c) 2025 Faisal Labib Zulfiqar, Ayu Rosalia

This work is licensed under a Creative Commons Attribution 4.0 International License.



