Cluster Analysis Using the Hierarchical Method for Grouping Sub-Districts in Langkat Regency Based on Health Indicators

Received February 15, 2020; Revised March 14, 2020; Accepted April 24, 2020

Cluster analysis is a method used to group objects based on the similarity of their characteristics. Hierarchical cluster analysis carries out the grouping in stages. Health is a condition in which a person is not sick, has no complaints, and can carry out daily activities. To obtain information about the level of health in Langkat Regency, a grouping method is needed. The grouping was carried out on the 23 sub-districts in Langkat Regency. The purpose of this study is to group sub-districts in Langkat Regency that have similar characteristics based on health indicators, using the squared Euclidean distance to measure the similarity between pairs of objects and the Ward method to form the clusters. The cluster analysis using the Ward method produces three clusters.


INTRODUCTION
Human health is a basic, multifaceted need that always deserves attention, because health is the first and main asset for human survival. Health is also one part of welfare, because the communities that are more successful in human development are those with a high level of health.
To describe the health condition of the community in Langkat Regency, health status indicators are used, including morbidity, birth attendants, and life expectancy.
In the 2017 National Social Survey (SENSUS), 15.65 percent of Langkat Regency residents reported health complaints; 46.24 percent of them sought outpatient treatment for these complaints, while 53.76 percent did not. The reasons given for not seeking outpatient treatment were having no money for medical expenses (2.11 percent), having no money for transportation (0.07 percent), treating themselves (79.25 percent), feeling that treatment was unnecessary (17.61 percent), and other reasons (0.96 percent) (BPS, 2018). The low awareness, willingness, and ability to live a healthy life in Langkat Regency has lowered the level of health there, as evidenced by the large number of people experiencing health complaints. Based on the description.

RESEARCH METHOD

Cluster Analysis
Cluster analysis is a multivariate (many-variable) analysis that groups objects or variables based on their characteristics. Its aim is to maximize the similarity of objects within a cluster while also maximizing the differences between clusters (Hair, 2009).
The steps for processing data so that a data set can be formed into clusters are as follows (Santoso, 2015).

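Determining the Distance Measure Between Data
The measure used for the similarity between data in cluster analysis is the squared Euclidean distance:

$$d_{ij} = \sum_{k=1}^{p} (x_{ik} - x_{jk})^2 \qquad (2.1)$$

where $x_{ik}$ is the value of object $i$ on variable $k$ and $p$ is the number of variables. The distances between objects can be written in the form of a matrix. A matrix is a collection of numbers (elements or entries, real or complex) arranged in rows and columns to form a rectangle of size m x n, enclosed in square brackets:

$$D = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & \cdots & d_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} \end{bmatrix}$$

where $d_{ij}$ is the distance between object $i$ and object $j$ for each i, j = 1, 2, ..., n.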
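As a minimal sketch of this step, the following Python code builds the symmetric matrix of squared Euclidean distances between objects. The function names and the three-object example data are hypothetical and only illustrate the calculation; they are not taken from the study's data.

```python
def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y))


def distance_matrix(rows):
    """Symmetric n x n matrix of squared Euclidean distances between rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The matrix is symmetric, so each pair is computed once.
            d[i][j] = d[j][i] = squared_euclidean(rows[i], rows[j])
    return d


# Hypothetical standardized scores for three objects on two variables.
example = [[0.0, 0.0], [1.0, 2.0], [-1.0, 1.0]]
D = distance_matrix(example)
for row in D:
    print(row)
```

The diagonal is zero because each object has zero distance to itself, and only the upper triangle needs to be computed.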

Data Standardization Process
In standardizing the data, the first thing to consider is whether the data contain outliers, that is, values that differ greatly in scale from the rest, among the research variables. Outliers can be detected by setting a boundary value beyond which an observation is treated as an outlier, after converting the raw scores into standardized scores (z-scores), which have a mean of zero and a standard deviation of one.
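The standardization described above can be sketched as follows. This is an illustration under stated assumptions: the sample standard deviation (divisor n - 1) is assumed, since the text does not say which divisor was used, and the function names and data values are hypothetical.

```python
import statistics


def z_scores(values):
    """Convert raw scores of one variable to z-scores (mean 0, SD 1)."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (divisor n - 1).
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - mean) / sd for v in values]


def standardize_columns(rows):
    """Standardize each variable (column) of an objects-by-variables table."""
    columns = list(zip(*rows))
    z_columns = [z_scores(col) for col in columns]
    # Transpose back so that each row is again one object.
    return [list(r) for r in zip(*z_columns)]


# Hypothetical raw data: four objects measured on two variables
# with very different scales.
raw = [[70.0, 12.0], [65.0, 30.0], [72.0, 18.0], [69.0, 24.0]]
Z = standardize_columns(raw)
for row in Z:
    print([round(v, 3) for v in row])
```

After this step every variable contributes on the same scale to the distance calculation, regardless of its original unit.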

Clustering Process
The clustering process can be carried out by two methods, namely the hierarchical and non-hierarchical methods.

a. Hierarchy Method
The hierarchical method is carried out in stages. It forms successive levels, as in a tree structure, and its result can be presented as a dendrogram. A dendrogram is a visual representation of the stages of the cluster analysis process, showing the distance coefficient at each stage. The numbers on the right of the dendrogram identify the research objects, and the lines connecting one object to another show which objects form a cluster (Simamora, 2005).
The single linkage method (nearest distance) groups data based on the shortest distance (Rencher, 2002).
The average linkage method groups data based on the average of the distances between all pairs of data in the two clusters (Rencher, 2002).
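The two linkage rules can be compared with a short sketch. It assumes the squared Euclidean distance as the dissimilarity between objects, in line with the measure used in this study; the function names and the two small clusters are hypothetical.

```python
def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def single_linkage(cluster_a, cluster_b):
    """Distance between clusters = smallest pairwise distance."""
    return min(squared_euclidean(x, y) for x in cluster_a for y in cluster_b)


def average_linkage(cluster_a, cluster_b):
    """Distance between clusters = mean of all pairwise distances."""
    total = sum(squared_euclidean(x, y) for x in cluster_a for y in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


# Hypothetical clusters of two objects each, lying on one axis.
cluster_a = [[0.0, 0.0], [1.0, 0.0]]
cluster_b = [[3.0, 0.0], [5.0, 0.0]]

print(single_linkage(cluster_a, cluster_b))   # 4.0
print(average_linkage(cluster_a, cluster_b))  # 13.5
```

The four pairwise distances here are 9, 25, 4, and 16, so single linkage keeps the smallest (4) while average linkage takes their mean (13.5).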
Ward's method forms clusters by maximizing the similarity within each cluster, using all of the observations in the calculation. At each stage, the two clusters whose merger produces the smallest increase in the Sum of Square Error (SSE) are combined (Rencher, 2002).
If A, B, and AB are clusters, where AB is obtained by combining A and B, then the sums of squares of errors within the clusters are:

$$SSE_A = \sum_{i=1}^{n_A} (y_i - \bar{y}_A)'(y_i - \bar{y}_A)$$

$$SSE_B = \sum_{i=1}^{n_B} (y_i - \bar{y}_B)'(y_i - \bar{y}_B)$$

$$SSE_{AB} = \sum_{i=1}^{n_{AB}} (y_i - \bar{y}_{AB})'(y_i - \bar{y}_{AB})$$

where $\bar{y}_A$, $\bar{y}_B$, and $\bar{y}_{AB}$ are the centroids (mean vectors) of the clusters, and $n_A$, $n_B$, and $n_{AB} = n_A + n_B$ are the numbers of objects in them. The Ward method joins the two clusters A and B that minimize the increase in SSE, defined as follows:

$$I_{AB} = SSE_{AB} - (SSE_A + SSE_B) \qquad (2.9)$$

It can be shown that the increase $I_{AB}$ in equation (2.9) has the following equivalent form (Sukmawati, 2017):

$$I_{AB} = \frac{n_A n_B}{n_A + n_B} (\bar{y}_A - \bar{y}_B)'(\bar{y}_A - \bar{y}_B) \qquad (2.10)$$

$SSE_A$ and $SSE_B$ are zero if A consists only of $y_i$ and B consists only of $y_j$. In that case, equations (2.9) and (2.10) reduce to the formula used to calculate the distance between objects with the Ward method:

$$I_{ij} = SSE_{AB} = \frac{1}{2} (y_i - y_j)'(y_i - y_j) = \frac{1}{2} d_{ij} \qquad (2.11)$$

where $d_{ij}$ is the squared Euclidean distance between objects $i$ and $j$.
The centroid method is also called the center point method; the distance between clusters in this method is the distance between their centroids. Each time a new cluster is formed, the centroid is recalculated (Rencher, 2002). After two clusters A and B join, the centroid of the new cluster is:

$$\bar{y}_{AB} = \frac{n_A \bar{y}_A + n_B \bar{y}_B}{n_A + n_B} \qquad (2.13)$$

b. Non-Hierarchy Method
The non-hierarchical method is also called the k-means method. It differs from the hierarchical method because it starts by fixing the desired number of clusters in advance; the observed objects are then merged to form those clusters.
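For the Ward and centroid formulas of the hierarchical method above, the following sketch checks numerically that the direct SSE increase and the centroid-based form agree, and that the merged centroid is the size-weighted mean of the two centroids. The function names and the small clusters A and B are hypothetical and serve only as an illustration.

```python
def centroid(cluster):
    """Mean vector of a cluster given as a list of equal-length rows."""
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]


def sse(cluster):
    """Sum of squared deviations of the rows from the cluster centroid."""
    c = centroid(cluster)
    return sum(sum((v - m) ** 2 for v, m in zip(row, c)) for row in cluster)


def ward_increase_direct(a, b):
    """Increase in SSE when a and b merge: SSE_AB - (SSE_A + SSE_B)."""
    return sse(a + b) - (sse(a) + sse(b))


def ward_increase_centroid(a, b):
    """Same increase, from cluster sizes and the centroid distance."""
    n_a, n_b = len(a), len(b)
    d2 = sum((p - q) ** 2 for p, q in zip(centroid(a), centroid(b)))
    return n_a * n_b / (n_a + n_b) * d2


def merged_centroid(a, b):
    """Centroid of the merged cluster as a size-weighted mean."""
    n_a, n_b = len(a), len(b)
    return [(n_a * p + n_b * q) / (n_a + n_b)
            for p, q in zip(centroid(a), centroid(b))]


# Hypothetical clusters: A has two objects, B has one.
A = [[0.0, 0.0], [2.0, 0.0]]
B = [[5.0, 1.0]]

print(ward_increase_direct(A, B))
print(ward_increase_centroid(A, B))
print(merged_centroid(A, B), centroid(A + B))
```

For two single-object clusters both functions return half of the squared Euclidean distance between the two objects, which is the special case used when the first pair of objects is merged.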

Data Sources and Research Variables
In this study, the data used are secondary data obtained from the Langkat District Health Office in 2018. This study uses three indicators of community health status with six variables.

Data analysis
1. Gather references on cluster analysis and health indicators.
2. Collect the health indicator data obtained from the Langkat District Health Office.
3. Describe the health indicator data.
4. Standardize the data.
5. Determine the cluster analysis procedure; this study uses hierarchical cluster analysis with the Ward method.
6. Carry out the cluster analysis.
7. Interpret the clusters formed from the results of the cluster analysis.
8. Draw conclusions and make recommendations.

RESULT AND ANALYSIS

Data Standardization
In the data standardization process, the z-score of each value is calculated using the z-score equation. This process continues until all of the data are on the same unit scale.

Calculating the Distance Measure Between Data
This step is carried out after the data have been standardized. The measure used to calculate the distance between objects is the squared Euclidean distance, equation (2.1). As a sample, the squared Euclidean distance between Bahorok sub-district and Serapit sub-district (objects 1 and 2) is:

5.763588514

This process continues until the distances between all pairs of objects are known.
Based on the squared Euclidean distances between the 23 sub-districts, the closest pair of objects is Salapian sub-district and Sawit Seberang sub-district, with a distance of 0.396188.

Cluster Analysis Process Using the Ward Method
The clustering process in the hierarchical method with the Ward method starts from the two closest objects (sub-districts), that is, the pair with the smallest distance among the 23 objects (sub-districts). For example, for Salapian sub-district and Sawit Seberang sub-district, equation (2.11) gives:

$$SSE_{\text{Salapian, S. Seberang}} = \frac{1}{2}(0.396188) \approx 0.198$$

The above calculation gives the distance between two clusters that each consist of one object. The clustering starts by combining the two closest objects into one cluster, then the distance between that cluster and the other objects is calculated; this process is carried out in stages.
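The staged process described above can be sketched as a simple loop that repeatedly merges the pair of clusters with the smallest increase in SSE until the desired number of clusters remains. This is a naive illustration, not the procedure or software used in the study; the function names and the five-object data set are hypothetical.

```python
def centroid(rows):
    """Mean vector of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def ward_cost(rows_a, rows_b):
    """Increase in SSE if the two clusters are merged."""
    n_a, n_b = len(rows_a), len(rows_b)
    d2 = sum((p - q) ** 2 for p, q in zip(centroid(rows_a), centroid(rows_b)))
    return n_a * n_b / (n_a + n_b) * d2


def ward_clustering(rows, k):
    """Merge clusters in stages until k clusters remain.

    Returns the clusters (as lists of row indices) and the merge history
    as (cluster_1, cluster_2, cost) tuples.
    """
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of objects")
    clusters = [[i] for i in range(len(rows))]  # every object starts alone
    history = []
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([rows[t] for t in clusters[i]],
                                 [rows[t] for t in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        history.append((clusters[i], clusters[j], cost))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, history


# Hypothetical standardized data: five objects, two variables.
data = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]]
clusters_out, history = ward_clustering(data, k=3)
print(clusters_out)
for first, second, cost in history:
    print(first, second, round(cost, 3))
```

In the first stage every cluster holds one object, so the cost of each candidate merge is half of the squared Euclidean distance between the two objects; later stages use the cluster sizes and centroids.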
The cluster analysis process using the Ward method produces 3 clusters. Cluster 1, denoted A, consists of Bahorok, Kuala, Binjai, Wampu, Hinai, Padang Tualang, Batang Serangan, Gebang, Pangkalan Susu, and Besitang sub-districts. Cluster 2, denoted B, consists of Serapit, Salapian, Kutambaru, Sawit Seberang, Sei Lepan, Brandan Barat, and Pematang Jaya sub-districts.
From Table 1, it can be concluded that the Ward method groups the sub-districts in Langkat Regency into 3 clusters based on health status: a cluster with a high degree of health, a cluster with a moderate degree of health, and a cluster with a low degree of health.
The cluster with a high degree of health is cluster 2; this can be seen in variables