Language Detection

Here we train machine learning models to detect the language of a piece of text.

The dataset contains 17 languages, listed below with their sample counts:

1-> English-> 1382
2-> French-> 1007
3-> Spanish-> 816
4-> Portuguese-> 736
5-> Italian-> 694
6-> Russian-> 688
7-> Swedish-> 673
8-> Malayalam-> 591
9-> Dutch-> 542
10-> Arabic-> 532
11-> Turkish-> 471
12-> German-> 465
13-> Tamil-> 464
14-> Danish-> 424
15-> Kannada-> 366
16-> Greek-> 358
17-> Hindi-> 62
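
The counts above are uneven, which matters when reading an accuracy figure. As a rough sketch, assuming the numbers are per-language sample counts and that a test split keeps about the same proportions, the snippet below computes each language's share, the accuracy of always guessing the largest class, and the ratio between the largest and smallest class. The variable names are illustrative only.

```python
# Per-language counts copied from the list above.
counts = {
    "English": 1382, "French": 1007, "Spanish": 816, "Portuguese": 736,
    "Italian": 694, "Russian": 688, "Swedish": 673, "Malayalam": 591,
    "Dutch": 542, "Arabic": 532, "Turkish": 471, "German": 465,
    "Tamil": 464, "Danish": 424, "Kannada": 366, "Greek": 358,
    "Hindi": 62,
}

total = sum(counts.values())

# Accuracy of a trivial model that always predicts the most frequent language.
majority_label, majority_count = max(counts.items(), key=lambda kv: kv[1])
baseline_accuracy = majority_count / total

# How many times larger the biggest class is than the smallest one.
imbalance_ratio = max(counts.values()) / min(counts.values())

for language, n in counts.items():
    print(f"{language:<12}{n:>6}{n / total:>8.1%}")

print(f"total samples: {total}")
print(f"majority baseline ({majority_label}): {baseline_accuracy:.3f}")
print(f"largest/smallest class ratio: {imbalance_ratio:.1f}")
```

With these counts, always guessing English would be right only about 13% of the time. Hindi has roughly 22 times fewer samples than English, so per-language metrics may be worth checking alongside overall accuracy.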

The dataset used to train and test the models is taken from Kaggle.
Link : Click here

The following scikit-learn classifiers were trained:

KNeighborsClassifier
RandomForestClassifier
ExtraTreesClassifier
MultinomialNB
GaussianNB
DecisionTreeClassifier
ExtraTreeClassifier
LinearSVC
SVC
LogisticRegressionCV
SGDClassifier
RidgeClassifierCV

The highest accuracy we got is 0.98, with MultinomialNB.
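
A comparison like the one above could be sketched as follows. This is a minimal illustration, not the code used in this repository. The character n-gram features, the choice of three classifiers from the list, and the tiny made-up sentences standing in for the Kaggle data are all assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up toy sentences (NOT from the Kaggle dataset), only to make the sketch runnable.
train_texts = [
    "the weather is nice today", "where is the train station",
    "il fait beau aujourd'hui", "où se trouve la gare",
    "hoy hace buen tiempo", "dónde está la estación de tren",
    "das wetter ist heute schön", "wo ist der bahnhof",
]
train_labels = [
    "English", "English", "French", "French",
    "Spanish", "Spanish", "German", "German",
]
test_texts = [
    "this is a small test", "ceci est un petit test",
    "esta es una prueba pequeña", "das ist ein kleiner test",
]
test_labels = ["English", "French", "Spanish", "German"]

# A subset of the classifiers listed above, with default settings.
candidates = {
    "MultinomialNB": MultinomialNB(),
    "LinearSVC": LinearSVC(),
    "SGDClassifier": SGDClassifier(random_state=0),
}


def compare(models, x_train, y_train, x_test, y_test):
    """Fit each classifier on character n-gram counts and return test accuracies."""
    scores = {}
    for name, clf in models.items():
        # Assumed feature extraction: counts of character 1- to 3-grams.
        pipeline = make_pipeline(
            CountVectorizer(analyzer="char", ngram_range=(1, 3)), clf
        )
        pipeline.fit(x_train, y_train)
        scores[name] = pipeline.score(x_test, y_test)  # mean accuracy
    return scores


scores = compare(candidates, train_texts, train_labels, test_texts, test_labels)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

On real data, the toy lists would be replaced by a train/test split of the Kaggle file. Scores on a corpus this small say nothing about the 0.98 reported above.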

A paper on this work was published at OCIT 2023. To view the paper, click here.