4.2 Article

An Approach for Analyzing Unstructured Text Data Using Topic Modeling Techniques for Efficient Information Extraction

Journal

NEW GENERATION COMPUTING
Volume -, Issue -, Pages -

Publisher

SPRINGER
DOI: 10.1007/s00354-023-00230-5

Keywords

Information Retrieval; Information Extraction; Natural language processing (NLP); Topic modeling; Latent Dirichlet Allocation (LDA); Coherence score


This research proposes a framework using topic modeling techniques to extract legal information from unstructured legal judgments in the Indian judicial system. The proposed framework aims to automate judgment analysis and quickly examine a large number of judgments. The framework is built on Latent Dirichlet Allocation and categorizes legal judgments into extracted topic groups. The framework has been successfully applied to different batch sizes of legal judgments and can be used to measure legal judgment similarity.
Topic modeling techniques are widely used for document clustering, large-scale text analysis, information extraction from unstructured text documents, feature selection from large corpora, and various recommendation systems. This work proposes a framework that uses topic modeling techniques to extract legal information from the unstructured legal judgments of the Indian judicial system. The approach aims to replace time-consuming manual judgment analysis with automated analysis that can examine a large number of judgments in less time. We experimented with different topic modeling methodologies for information extraction. The proposed framework is built on Latent Dirichlet Allocation (LDA) and categorizes legal judgments into extracted topic groups. Indian Supreme Court judgments are used for the experimental setting. The three main elements of the framework are pre-processing, applying the topic model, and model evaluation using a coherence score metric. The framework was successfully applied to corpora of 100, 500, and 1000 legal judgments, processed in batches. As a quantitative evaluation, the framework is used to measure similarity between legal judgments. As future work, various legal tasks whose performance could benefit from the proposed framework are suggested.
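
The three elements named in the abstract (pre-processing, topic model, coherence-based evaluation) and the similarity measurement can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy stopword list, the four made-up judgment snippets, the collapsed Gibbs sampler used to fit LDA, the choice of UMass coherence (the abstract does not say which coherence measure is used), and cosine similarity over document-topic vectors as the similarity measure.

```python
import math
import random
import re

# Toy stopword list (an assumption; a real pipeline would use a fuller list).
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "was", "for", "by", "on", "that"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and tokens under 3 letters."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS and len(t) > 2]

def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns (doc_topic, topic_word, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * num_topics for _ in docs]      # document-topic counts
    nkw = [[0] * V for _ in range(num_topics)]  # topic-word counts
    nk = [0] * num_topics                       # tokens assigned to each topic
    z = []                                      # topic assignment of every token
    for d, doc in enumerate(docs):              # random initialisation
        zd = []
        for w in doc:
            k = rng.randrange(num_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], z[d][i]
                # Remove the token's current assignment, then resample its topic.
                ndk[d][k] -= 1
                nkw[k][wid] -= 1
                nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][wid] + beta) / (nk[t] + V * beta)
                           for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wid] += 1
                nk[k] += 1
    doc_topic = [[(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
                 for row, doc in zip(ndk, docs)]
    topic_word = [[(c + beta) / (nk[k] + V * beta) for c in nkw[k]]
                  for k in range(num_topics)]
    return doc_topic, topic_word, vocab

def umass_coherence(top_words, docs):
    """UMass coherence of a ranked word list, using document co-occurrence counts."""
    doc_sets = [set(d) for d in docs]
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = sum(top_words[j] in s for s in doc_sets)
            d_ij = sum(top_words[i] in s and top_words[j] in s for s in doc_sets)
            score += math.log((d_ij + 1) / d_j)
    return score

def cosine(u, v):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Made-up snippets standing in for judgments (not taken from the paper's corpus).
judgments = [
    "The appellant challenged the land acquisition and the compensation awarded.",
    "Compensation for acquisition of agricultural land was enhanced on appeal.",
    "The accused was convicted of murder; the conviction was upheld.",
    "Bail was granted to the accused pending trial for murder.",
]
docs = [preprocess(j) for j in judgments]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
for k, row in enumerate(topic_word):
    top = [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:5]]
    print("topic", k, top, "coherence", round(umass_coherence(top, docs), 3))
print("similarity of judgments 0 and 1:", round(cosine(doc_topic[0], doc_topic[1]), 3))
```

On a corpus of real size, a library implementation of LDA and coherence scoring would normally replace the hand-written sampler; the sketch only makes the data flow between the three stages concrete.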
