Mines_Detection_Using_SONAR_Data

Detection of mines and rocks using SONAR data. Accuracy is compared across several classification models.

This project aims to classify an underwater obstacle as either a rock or a mine.

During the Russo-Japanese War of 1904–1905, the Petropavlovsk struck two mines near Port Arthur. The explosions holed the vessel, sent it to the bottom, and killed most of its crew. This shows why detecting underwater mines with good accuracy is crucial to the navy of any country.

1. Data

I used a publicly available SONAR dataset. It contains 60 numerical features.

1.1 Data Pre-Processing

The dataset contains no missing values, and all features are numeric, so the only pre-processing needed is to find the relevant features. I used principal component analysis (PCA) to remove correlation among the features and reduce the dimensionality.
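A minimal sketch of this step, assuming scikit-learn and NumPy. The data here is a synthetic stand-in with 60 correlated features (the sample count of 200 and the choice of 10 components are arbitrary assumptions, not values taken from this project). It illustrates that the PCA-projected features are uncorrelated with each other:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the SONAR data: 200 samples x 60 correlated features.
# (Assumption: the real dataset would be loaded here instead.)
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 10))
mixing = rng.normal(size=(10, 60))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 60))

# Project onto the first 10 principal components (10 is an arbitrary choice).
pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)

# The covariance matrix of the projected data is (numerically) diagonal,
# i.e. the new features are uncorrelated.
cov = np.cov(X_reduced, rowvar=False)
off_diag = cov - np.diag(np.diag(cov))
print(X_reduced.shape)
print(np.abs(off_diag).max())
```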

The plot below shows how the explained variance varies with the number of PCA components.
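The curve behind such a plot could be computed roughly as follows. This is a sketch on synthetic data, assuming scikit-learn. The 95% threshold used to pick a component count is an illustrative assumption, not a value used in this project:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the 60-feature SONAR data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60)) @ rng.normal(size=(60, 60))

# Fit PCA with all components and accumulate the explained variance ratios.
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative explained variance
# reaches the (assumed) 95% threshold.
n_95 = int(np.searchsorted(cumulative, 0.95) + 1)
print(n_95)
```

Plotting `cumulative` against `range(1, len(cumulative) + 1)` gives the explained-variance curve.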

2. Model Selection

I compared how accuracy varies with the number of PCA components for the LinearRegression, LogisticRegression and RandomForestClassifier models.
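A comparison of this kind could be set up as in the sketch below, which assumes scikit-learn and uses a synthetic dataset from `make_classification` in place of the SONAR data. The component counts, split ratio and hyperparameters are illustrative assumptions. LinearRegression is left out here because it is a regressor and needs an extra step to produce class labels:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the SONAR data (60 numeric features, 2 classes).
X, y = make_classification(n_samples=200, n_features=60, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Factories so that each run starts from a fresh, unfitted model.
models = {
    "LogisticRegression": lambda: LogisticRegression(max_iter=1000),
    "RandomForestClassifier": lambda: RandomForestClassifier(
        n_estimators=50, random_state=0),
}

results = {name: {} for name in models}
for n_components in (2, 5, 10, 20, 40):
    # Fit PCA on the training split only, so the test split stays unseen.
    pca = PCA(n_components=n_components).fit(X_train)
    Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)
    for name, make_model in models.items():
        model = make_model().fit(Z_train, y_train)
        results[name][n_components] = accuracy_score(
            y_test, model.predict(Z_test))

for name, scores in results.items():
    print(name, scores)
```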

2.1 LinearRegression

The plot below shows how test accuracy varies with the number of PCA components for the LinearRegression model.
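LinearRegression is a regressor, so measuring its accuracy requires turning its continuous output into a class label. This README does not say how that was done. The sketch below assumes labels encoded as 0 and 1 and a decision threshold of 0.5, and it uses synthetic data in place of the SONAR dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the SONAR data, with labels encoded as 0 and 1.
X, y = make_classification(n_samples=200, n_features=60, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit an ordinary least-squares model directly on the 0/1 labels.
reg = LinearRegression().fit(X_train, y_train)

# Assumed decision rule: predict class 1 when the output is at least 0.5.
y_pred = (reg.predict(X_test) >= 0.5).astype(int)
accuracy = float(np.mean(y_pred == y_test))
print(accuracy)
```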

2.2 LogisticRegression

The plot below shows how test accuracy varies with the number of PCA components for the LogisticRegression model.

2.3 RandomForestClassifier

The plot below shows how test accuracy varies with the number of PCA components for the RandomForestClassifier model.

3. Conclusion

LinearRegression has the worst performance among the above models.
One interesting observation from the accuracy vs. number of PCA components plots above: RandomForestClassifier (an ensemble model) outperforms LogisticRegression when the number of PCA components is low. However, the two models perform nearly the same as the number of PCA components increases.
When the number of PCA components is in the range 30-40, all the models generalize poorly to unseen data.
Using 8-12 PCA components with RandomForestClassifier gives the best performance on unseen data.
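A configuration in that range could be evaluated as in the sketch below, assuming scikit-learn. It chains PCA with 10 components (one value from the 8-12 range) and a RandomForestClassifier in a pipeline and scores it with 5-fold cross-validation. The data is synthetic, and the fold count and forest size are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the SONAR data (60 numeric features, 2 classes).
X, y = make_classification(n_samples=200, n_features=60, n_informative=10,
                           random_state=0)

# PCA is refit inside each fold, so held-out samples never influence it.
pipeline = make_pipeline(
    PCA(n_components=10),
    RandomForestClassifier(n_estimators=50, random_state=0),
)

scores = cross_val_score(pipeline, X, y, cv=5)
mean_accuracy = scores.mean()
print(mean_accuracy)
```

Keeping PCA inside the pipeline means the reported score reflects performance on data the whole pipeline has not seen.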
