Comparative Analysis Of K Means And K Medoids Algorithms In Determining Social Assistance In

Abstract: The government tackles poverty by issuing several programs that help meet people's needs. In practice, however, aid and other government programs are often distributed off target because there are no clear procedures or calculations for determining who is entitled to receive aid. A calculation mechanism is therefore needed that applies clustering to data on the demographic characteristics of the community. Two algorithms are commonly used for this kind of clustering, namely K-means and K-medoids. This research aims to classify which residents are a priority for receiving assistance and which are not. To obtain more accurate results, the research also tests both clustering algorithms to find the best algorithm and the best number of clusters for the dataset, judged by the Davies Bouldin Index (DBI). The research concludes that, for this dataset, the best algorithm is K-medoids with 2 clusters and a DBI value of 1.332. Of the 1031 records analyzed, clustering identified 396 residents as eligible for, or a priority for, assistance and 635 residents as not yet on the priority list for assistance.


Introduction
Economic development is a series of efforts to develop economic activities in order to improve people's lives. The government has long worked to reduce poverty by issuing several programs that help meet the needs of the community. Some of these programs are directly consumptive, some are in the field of education, and some take the form of capital intended to support the development of Micro, Small and Medium Enterprises (MSMEs), so that residents have a business with which to improve their own economic level. These programs include the Smart Indonesia Program (PIP) provided through the Smart Indonesia Card (KIP), the Healthy Indonesia Program (PIS) provided through the Healthy Indonesia Card (KIS), the Family Hope Program (PKH), and other programs aimed at helping meet the needs of the poor (Pratiwi, Noorsyarifa and Apsari, 2022).
The distribution of aid or government programs at the regional level is supervised and monitored by the Social Service, a government organization that implements policies made at a higher level of government. In practice, however, it is not uncommon for assistance or other government programs to be distributed off target: people who should be entitled to assistance do not receive it, or vice versa. This happens because there are no clear procedures and calculations for determining which people are entitled to assistance. Decisions still rest on observations and considerations that are potentially subjective. A calculation mechanism is therefore needed that involves data on the demographic characteristics of the community, such as asset ownership status, latest education, etc. Grouping, or clustering, residents by economic level on the basis of these demographic characteristics can help the local government, in this case the Padang Sidimpuan City Social Service, select residents who are entitled to government assistance or programs.
Data mining offers techniques that can carry out this clustering or grouping quickly on the basis of mathematical calculations. Data mining is the process of finding interesting patterns or information in selected data using certain techniques or methods. The techniques, methods, and algorithms in data mining vary greatly (Yuli Mardi, 2019).
https://penerbitadm.pubmedia.id/index.php/KOMITEK
Clustering is one of the techniques in data mining. It groups data with the same characteristics into the same region and data with different characteristics into other regions. Two algorithms are often used for clustering, namely K-means and K-medoids.
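The main difference between the two algorithms lies in how a cluster centre is chosen. The following is a minimal sketch, not the procedure used in this study: it assumes Euclidean distance, and the function names and toy points are illustrative only. K-means takes the coordinate-wise mean of the members, while K-medoids picks the member with the smallest total distance to the others.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_center(points):
    """K-means style centre: the coordinate-wise mean of the members."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))

def medoid_center(points):
    """K-medoids style centre: the member with the smallest total distance to all members."""
    return min(points, key=lambda c: sum(euclidean(c, p) for p in points))

# Illustrative toy cluster with one outlier (not data from the study).
cluster = [(1, 1), (2, 2), (3, 3), (4, 4), (30, 30)]
mean_c = mean_center(cluster)      # pulled towards the outlier
medoid_c = medoid_center(cluster)  # always an actual data point
print(mean_c, medoid_c)
```

Because the medoid must be an existing record, a single extreme record shifts it less than it shifts the mean.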
Based on the background described above, the researcher plans to conduct research with the title Comparative analysis of k-means and k-medoids algorithms in determining social assistance at the Padang Sidimpuan City Social Service, North Sumatra. In this study, researchers will cluster residents by economic level using the two algorithms and examine which of the two performs better based on the Davies Bouldin Index (DBI) value.
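For reference, the Davies Bouldin Index is commonly defined as shown below. The symbols are notation introduced here for illustration and do not come from the study itself, and a particular tool may use a different distance measure or sign convention. For k clusters, where cluster C_i has centre c_i:

```latex
\begin{align}
S_i &= \frac{1}{|C_i|} \sum_{x \in C_i} \lVert x - c_i \rVert
  && \text{(average within-cluster distance)} \\
M_{ij} &= \lVert c_i - c_j \rVert
  && \text{(distance between centres)} \\
R_{ij} &= \frac{S_i + S_j}{M_{ij}}, \qquad i \neq j \\
\mathrm{DBI} &= \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} R_{ij}
\end{align}
```

Under this definition a lower DBI indicates clusters that are more compact and better separated, which is why the configuration with the smallest value is preferred.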

A. Data collection
Data collection in this study was carried out using the following methods:
a. Interview
Interviews will be conducted with one employee of the Padang Sidimpuan City Social Service, North Sumatra, to get the data needed.
b. Documentation
The documentation method is used to obtain data and information in the form of books, archives, documents, numbers, and images, in the form of reports and information that can support the research. In this research, researchers will submit requests to the Social Service for archives or citizen data documents to be analyzed with data mining.

B. Research Method KDD Process
The stages of the Knowledge Discovery in Database (KDD) process carried out in this research are:
a. Data Selection
At the data selection stage, data that do not need to be included are removed from the dataset, so that the dataset is not too large and the remaining data actually serve as a reference in the data mining process.
b. Data Pre-Processing
The data pre-processing stage cleans noisy data, for example by completing empty data or by deciding to exclude empty data from the data mining stage. Pre-processing includes data cleaning, which removes noise from the data; data integration, which merges and equalizes data that may come from different sources or files; and data transformation, which converts data into a format that can be calculated, for example from non-numeric to numeric values.

c. Data Mining
This is the core stage of the research. Calculations will be carried out with the k-means and k-medoids algorithms using the RapidMiner data mining application and also manually. The stage has several steps. First, the dataset is tested with both algorithms to find how many clusters are ideal, and that number is used as k, the number of clusters. The clustering process is then carried out with the best algorithm and the best number of clusters according to that test.
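The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the RapidMiner procedure used in the study: the DBI is computed with Euclidean distance and cluster means as centres, and the toy points, the candidate labelings, and the names `davies_bouldin`, `candidates`, and `best` are hypothetical. In practice each candidate labeling would come from running one algorithm with one value of k.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def davies_bouldin(points, labels):
    """Davies Bouldin Index with cluster means as centres (lower is better).

    Assumes at least two clusters whose centres do not coincide.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    centres, scatter = {}, {}
    for label, members in clusters.items():
        dim = len(members[0])
        centre = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        centres[label] = centre
        # Average distance of the members to their own centre.
        scatter[label] = sum(euclidean(m, centre) for m in members) / len(members)

    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centres[i], centres[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)

# Hypothetical toy data and two hypothetical labelings to compare.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
candidates = {
    "candidate A": [0, 0, 1, 1],  # groups nearby points together
    "candidate B": [0, 1, 0, 1],  # groups distant points together
}
scores = {name: davies_bouldin(points, labels) for name, labels in candidates.items()}
best = min(scores, key=scores.get)  # the labeling with the lowest DBI
print(scores, best)
```

The same comparison extends to any number of candidates, for example one entry per combination of algorithm and k.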

Result and Discussion
This section discusses the data mining process, namely clustering the data obtained during the research with the k-means and k-medoids methods in the RapidMiner data mining application.
The research was conducted at the Social Service of Padang Sidimpuan City, North Sumatra. The data obtained, which form the dataset for the data mining process, consist of 1057 citizen records. The data must go through selection before calculations are carried out, in accordance with the steps of the data mining process. At this stage, data that do not need to be included in the dataset are discarded, so that there is not too much data to process and the remaining data truly serve as a reference in the data mining process.
In the existing dataset, columns that were deemed unnecessary were discarded, such as employment status of the head of the family, roof quality, floor quality, etc. These columns were discarded because most residents did not answer, so no valid information was obtained from them. The data resulting from this selection are displayed below.
Data Pre-Processing
On the resulting dataset, this stage consists of cleaning noise, filling in or removing rows with empty data, and data transformation, namely changing data that cannot be calculated into data that can. Data transformation is carried out by converting information in the form of sentences into numbers, so that all the data in the dataset are numeric and can be calculated using the k-means and k-medoids algorithms. There are no fixed provisions or formulas for the data transformation process, so in this research the transformation refers to the quality or value of the goods used by residents. The lowest-quality category receives the lowest value: for example, an earthen floor is given a value of 1, whereas a floor of marble or similar material is given a high value. The same applies to the other attributes. One of the references used by the researchers for data transformation is the research of (Fakhrul Gunawan, Rakhmat Umbara and Kasyidi, 2022) with the title "Grouping the Economic Status of Tanjungsari Village Families using the K-Means Clustering Method". In that research, the type of house lived in was transformed by giving a score of 6 to "own", 5 to "contract/rent", 3 to "borrowed", and 1 to "other".
After this process, a total of 1031 records were obtained that were ready for analysis.
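The cleaning and transformation steps can be sketched as follows. This is a minimal illustration, not the procedure used in the study. The house scores are those quoted from the cited reference, while the floor scale beyond "earth = 1", the field names `house_status` and `floor_type`, and the sample records are assumptions made for the example.

```python
# Scores quoted from the cited reference study.
HOUSE_STATUS_SCORE = {"own": 6, "contract/rent": 5, "borrowed": 3, "other": 1}

# Hypothetical floor scale: only "earth = 1" and "marble is high" come from the text.
FLOOR_SCORE = {"earth": 1, "cement": 2, "ceramic": 3, "marble": 4}

def encode(records):
    """Drop records with missing or unknown answers, then map text to numeric scores."""
    encoded = []
    for record in records:
        house = record.get("house_status")
        floor = record.get("floor_type")
        if house not in HOUSE_STATUS_SCORE or floor not in FLOOR_SCORE:
            continue  # cleaning step: skip rows with empty or unrecognized data
        encoded.append((HOUSE_STATUS_SCORE[house], FLOOR_SCORE[floor]))
    return encoded

# Hypothetical sample records (not data from the study).
sample = [
    {"house_status": "own", "floor_type": "marble"},
    {"house_status": "borrowed", "floor_type": "earth"},
    {"house_status": None, "floor_type": "cement"},  # empty answer, dropped
]
numeric_rows = encode(sample)
print(numeric_rows)
```

Each remaining record becomes a tuple of numbers, which is the form that distance-based algorithms such as k-means and k-medoids require.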

Conclusion
Based on the research that has been done, the conclusions of this study are as follows:
(a) The clustering of a total of 1,031 citizen records produced 2 groups: underprivileged citizens who deserve assistance, in cluster 0, with 396 citizen records, and citizens categorized as well-off who have not yet become a priority for assistance, in cluster 1, with 635 citizen records.
(b) The calculations in RapidMiner and the manual calculations in Microsoft Excel show the same results, namely 396 citizen records in cluster 0 and 635 in cluster 1.
(c) Based on the dataset obtained and used in this study, and on the tests carried out, it is found that the k-medoids algorithm with the number of clusters 2 is still more behind than k-means.
(d) The results of clustering citizen data only provide the authorities with a reference for obtaining accurate data on the distribution of assistance, so that the assistance is right on target.

Figure 1 Dataset Data Selection

Figure 2 Selection Data Results

Figure 3 Data Noise Cleaning Results