**Technical Test for Placement**

## Abstract

**Technical Test for Placement** management report in data mining. With the growing volume of data held by educational institutes, there is a need to mine these large datasets for useful information. This research focuses on building a decision support system that helps educational institutes estimate the placement possibility of their students. The work is not limited to predicting placement: we performed a multi-level analysis of a student performance dataset to predict which level of the interview process a student is likely to pass. To obtain the predictions we applied Naïve Bayes and an Improved Naïve Bayes that is integrated with the Relief feature selection technique. Data analysis was done using NetBeans and WEKA. The proposed technique achieved 84.7% accuracy, compared with 80.96% for the existing Naïve Bayes.

## INTRODUCTION

Placements are considered very important for every college. A college's success is largely measured by the campus placement of its students, and many students choose a college on the basis of its placement percentage. This work therefore addresses the prediction and analysis of placement outcomes, to help both colleges and students improve their placements. The model is built using data mining techniques. The algorithms used are “Fuzzy logic” and “K nearest neighbor”. “Fuzzy logic is a logic system for reasoning that is approximate rather than exact”. The “K Nearest Neighbor (KNN) is a standard classification algorithm that collects all available cases and classifies the new cases based on the distance measures”. The accuracy of each model is tested and visualized, and the results of each model are discussed on the basis of this performance analysis.
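The distance-based classification described above can be illustrated with a minimal sketch. It assumes Java 8 or later, Euclidean distance, and a simple majority vote; the class name `KnnSketch`, the two features (a grade point and an aptitude score), and the sample records are hypothetical and are not taken from the dataset used in this work.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal k-nearest-neighbour sketch; all names and data are illustrative assumptions. */
public class KnnSketch {

    /** One labelled training case: numeric features plus a class label. */
    static final class Sample {
        final double[] features;
        final String label;

        Sample(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Euclidean distance between two feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Classifies the query by majority vote among its k closest training cases.
     * On a tie, the label that first reaches the highest count wins
     * (neighbours are visited nearest first).
     */
    static String classify(List<Sample> training, double[] query, int k) {
        List<Sample> sorted = new ArrayList<>(training);
        sorted.sort(Comparator.comparingDouble(s -> distance(s.features, query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (Sample s : sorted.subList(0, Math.min(k, sorted.size()))) {
            int count = votes.merge(s.label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = s.label;
            }
        }
        return best;
    }

    /** Hypothetical training data: {grade point, aptitude score} with a placement label. */
    static List<Sample> sampleData() {
        List<Sample> data = new ArrayList<>();
        data.add(new Sample(new double[] {8.5, 80.0}, "placed"));
        data.add(new Sample(new double[] {9.0, 85.0}, "placed"));
        data.add(new Sample(new double[] {7.8, 75.0}, "placed"));
        data.add(new Sample(new double[] {5.5, 40.0}, "not placed"));
        data.add(new Sample(new double[] {6.0, 45.0}, "not placed"));
        data.add(new Sample(new double[] {5.0, 50.0}, "not placed"));
        return data;
    }

    public static void main(String[] args) {
        String result = classify(sampleData(), new double[] {8.0, 78.0}, 3);
        System.out.println("Predicted class: " + result);
    }
}
```

Because the two features in this sketch are on different scales, the aptitude score dominates the distance; in practice the features would usually be normalized before KNN is applied.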

**Introduction to Data Mining:**

Data mining is the process of extracting useful information from large-scale datasets. In other words, it is the process of mining knowledge from structured and unstructured data, and it is also known as the knowledge discovery process for large unstructured data. Data mining is an essential step in the process of knowledge discovery in databases (KDD). The knowledge discovery process involves the following steps:

- Cleaning the data set: noise and inconsistent data are removed.
- Integration of data: multiple data sources are combined.
- Selection of data: the data relevant to the task are retrieved from the database.
- Transformation of data: the data are transformed, for example by aggregation or summarization, into forms appropriate for mining.
- Mining the data: intelligent methods are applied to extract useful data patterns.
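The steps listed above can be sketched on a toy student table. This is an illustration only, assuming Java 8 or later: the class `KddPipelineSketch`, the `Record` fields, the 0 to 100 validity range, and the sample values are all hypothetical, and the final "mining" step is a stand-in (column means) where a real system would apply a classifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of the KDD steps on toy data; every field and value here is hypothetical. */
public class KddPipelineSketch {

    /** A raw student record as it might arrive from one data source. */
    static final class Record {
        final String studentId;
        final double marksPercent;   // assumed valid range: 0..100
        final double aptitudeScore;  // assumed valid range: 0..100

        Record(String studentId, double marksPercent, double aptitudeScore) {
            this.studentId = studentId;
            this.marksPercent = marksPercent;
            this.aptitudeScore = aptitudeScore;
        }
    }

    static boolean inRange(double v) {
        return v >= 0.0 && v <= 100.0;
    }

    /** Cleaning: discard records whose values fall outside the valid range. */
    static List<Record> clean(List<Record> records) {
        List<Record> kept = new ArrayList<>();
        for (Record r : records) {
            if (inRange(r.marksPercent) && inRange(r.aptitudeScore)) {
                kept.add(r);
            }
        }
        return kept;
    }

    /** Integration: merge two sources, keeping the first record seen per student id. */
    static List<Record> integrate(List<Record> sourceA, List<Record> sourceB) {
        Map<String, Record> byId = new LinkedHashMap<>();
        for (Record r : sourceA) byId.putIfAbsent(r.studentId, r);
        for (Record r : sourceB) byId.putIfAbsent(r.studentId, r);
        return new ArrayList<>(byId.values());
    }

    /** Selection and transformation: keep the two numeric attributes, rescaled to [0, 1]. */
    static List<double[]> selectAndTransform(List<Record> records) {
        List<double[]> rows = new ArrayList<>();
        for (Record r : records) {
            rows.add(new double[] {r.marksPercent / 100.0, r.aptitudeScore / 100.0});
        }
        return rows;
    }

    /** Mining stand-in: the mean of each column; a real system would train a model here. */
    static double[] columnMeans(List<double[]> rows) {
        double[] means = new double[2];
        for (double[] row : rows) {
            means[0] += row[0];
            means[1] += row[1];
        }
        if (!rows.isEmpty()) {
            means[0] /= rows.size();
            means[1] /= rows.size();
        }
        return means;
    }

    public static void main(String[] args) {
        List<Record> office = new ArrayList<>();
        office.add(new Record("S1", 80.0, 70.0));
        office.add(new Record("S2", -1.0, 65.0));     // noisy: invalid marks

        List<Record> department = new ArrayList<>();
        department.add(new Record("S1", 80.0, 70.0)); // duplicate of S1
        department.add(new Record("S3", 60.0, 50.0));

        List<Record> merged = integrate(clean(office), clean(department));
        List<double[]> rows = selectAndTransform(merged);
        double[] means = columnMeans(rows);
        System.out.println(rows.size() + " usable records; mean marks = " + means[0]
                + ", mean aptitude = " + means[1]);
    }
}
```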

**Introduction to Educational Data Mining:**

The use of data mining approaches in educational settings is called Educational Data Mining (EDM). The International Educational Data Mining Society defines EDM as “an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in”.

## LITERATURE REVIEW

To show the importance of data mining in education, this section reviews related work by other researchers. Dijana, Mario and Milić performed cluster analysis on a set of students and identified a cluster with a higher number of students who had scored well in subjects such as mathematics and English in high school. These students were found to be female rather than male, which contradicted the hypothesis that male students perform better, and the authors concluded that such systems can be built with clustering. Maryam Zaffar, Hashmani and Savita analysed the performance of various feature selection algorithms and classifiers and compared their results on the basis of recall, precision and F-measure. They found that random forest combined with principal component analysis gives better results than the other techniques. Ashok MV and Apoorva built a placement prediction system based on students’ overall percentage and skill set. They proposed an algorithm and compared it with decision tree, Naïve Bayes and neural network classifiers; the proposed algorithm gave better accuracy than the existing ones.

Animesh, Vignesh, Bysani Pruthvi and Naini built a system to help the placement office and students know where they stand. They took a dataset of students’ details and applied KNN, logistic regression and support vector machine; the KNN classifier gave better accuracy than the other two. Mangasuli Sheetal and Savita built a placement prediction system using fuzzy logic and KNN. The two techniques were compared, and KNN was found to be more accurate than the fuzzy inference system. Keno C, Dumlao, Melvin and Shaneth built a system to determine the unemployment rate of countries in the Association of Southeast Asian Nations and found that the Philippines has more unemployed people. They applied Naïve Bayes, J48, SimpleCart, logistic regression and CHAID, of which logistic regression gave the highest accuracy and the lowest error rate. Getaneh and Dr. Sreenivasarao built a system that places students in different university departments based on their entrance exam scores. They applied Naïve Bayes, J48 and random forest and concluded that such systems can be built using J48, as it gave the highest accuracy.

Anupam and Soumya K. evaluated the relation between poor student results and teaching quality, applying the Apriori algorithm. The rules produced by association rule mining contradicted the hypothesis that poor results are caused by poor student quality and showed that other factors affect student performance. Liang, Huang, Qing, Yunheng and Lang worked on determining whether a student is extraverted or introverted. They used scikit-learn and applied Naïve Bayes, classification and regression trees, and linear SVM. Students who pay more attention online were predicted to be introverted and the others extraverted, and linear SVM gave higher accuracy than the other two classifiers. Larian and Muesser investigated whether students access the online material available to them at the last minute, that is, whether they procrastinate on online submissions.

## System Configuration:

**H/W System Configuration:-**

Processor : Pentium IV

Speed : 1 GHz

RAM : 512 MB (min)

Hard Disk : 20 GB

Keyboard : Standard Keyboard

Mouse : Two or Three Button Mouse

Monitor : LCD/LED Monitor

**S/W System Configuration:-**

Operating System : Windows XP/7

Programming Language : Java/J2EE

Software Version : JDK 1.7 or above

Database : MySQL

## Conclusion

Campus placement activity is very important from the point of view of both the institution and the student. With the aim of improving student performance, placement outcomes were analysed and predicted using Fuzzy logic and the KNN algorithm, and the two approaches were validated against each other. The algorithms were applied to the data set, using its attributes to build the models. The accuracy obtained was 97.33% for KNN and 92.67% for Fuzzy logic. On the basis of this analysis, KNN is the better choice for predicting placement results.