Friday, 23 October 2020

Doing Sentiment Analysis using Natural Language Processing

Introduction:

 In the field of Artificial Intelligence, the use of Natural Language Processing (NLP) is growing rapidly. Some common applications of NLP are as follows:

 Text Classification (spam detection, etc.)

 Sentiment Analysis

 Author Recognition

 Machine Translation

 Chatbots

What is Sentiment Analysis?

One of the most common applications of Natural Language Processing is Sentiment Analysis, which lets us determine the emotion expressed in a piece of text.

As social media platforms grow more popular day by day and more people become attached to them, the need to analyze the content that people share and post there is increasing rapidly. Given the volume of data coming through social media, it is really difficult to do this with human effort alone. Therefore, the need for applications that can quickly detect and respond to the positive or negative comments that people write is increasing. In this blog, we will develop a simple baseline model for sentiment analysis.

First of all, let's go through the information about the dataset.

Data Set Name: Sentiment Labelled Sentences Data Set

Data Set Source: UCI Machine Learning Repository

Data Set Info: This dataset was created from user reviews collected from three different websites (Amazon, Yelp and IMDb). The reviews cover restaurants, films and products. Each record in the data set carries one of two sentiment labels: 1 for positive and 0 for negative. We will create a sentiment analysis model using this data set.
Let's build a Machine Learning model in Python using the sklearn and nltk libraries.
First, let's import the libraries we will use.

Now we'll load and view our data set.
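As a rough sketch of this loading step: the snippet below assumes each dataset file holds one review and its label per line, separated by a tab. The `SAMPLE` text, the `load_reviews` helper and the column names `text` and `label` are all made up for illustration, and pandas is my own choice here. To read a real file, pass its path instead of the in-memory sample.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for one dataset file. The real files are assumed to
# hold one "sentence<TAB>label" pair per line.
SAMPLE = (
    "Great phone, the battery lasts all day.\t1\n"
    "The food was cold and bland.\t0\n"
    "A wonderful film with a strong cast.\t1\n"
    "Stopped working after a week.\t0\n"
)


def load_reviews(source):
    """Read tab-separated review/label pairs into a DataFrame."""
    return pd.read_csv(
        source,
        sep="\t",
        header=None,               # the file has no header row
        names=["text", "label"],   # illustrative column names
        quoting=csv.QUOTE_NONE,    # treat quote characters as plain text
    )


df = load_reviews(io.StringIO(SAMPLE))
print(df.head())
```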

We have successfully imported the data and viewed it. Now, let's look at some statistics about the data.

Notice that the data set is very balanced, i.e. it has almost equal numbers of positive and negative examples.
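One simple way to check class balance is to count how often each label appears. The labels below are invented for illustration. With the real data you would count the label column instead.

```python
from collections import Counter

# Hypothetical labels: 1 = positive, 0 = negative.
labels = [1, 0, 1, 0, 1, 0]

counts = Counter(labels)
total = sum(counts.values())

# Share of each class; values close to 0.5 indicate a balanced data set.
shares = {label: n / total for label, n in counts.items()}

for label in sorted(counts):
    print(f"label {label}: {counts[label]} rows ({shares[label]:.0%})")
```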

Now, before using the data set in the model, let's do a few things to clean the text.
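The exact cleaning steps are not listed here, so the following is only a minimal sketch of what such a step might look like: lowercase the text, strip everything except letters, and drop a few very common words. The `clean_text` helper and the tiny `STOP_WORDS` set are assumptions for illustration. A fuller list, such as the English stop words shipped with nltk, could be swapped in.

```python
import re

# Tiny hand-written stop-word list, for illustration only. Negations such as
# "not" are deliberately kept because they carry sentiment.
STOP_WORDS = {"a", "an", "and", "the", "is", "was", "it", "this", "of", "to"}


def clean_text(text):
    """Lowercase, keep letters only, and drop common stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # replace digits and punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("The food was GREAT, and the service is 10/10!"))
# prints: food great service
```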

The data is now pre-cleaned and ready for use in the model. Before we build our model, let's split our dataset into test (10%) and training (90%) sets.
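A 90/10 split like this can be sketched with sklearn's `train_test_split`. The toy `texts` and `labels` below are placeholders, and the `random_state` and `stratify` arguments are my own additions to make the split repeatable and keep both classes in each part.

```python
from sklearn.model_selection import train_test_split

# Placeholder data: 20 fake reviews with alternating labels.
texts = [f"review {i}" for i in range(20)]
labels = [i % 2 for i in range(20)]

X_train, X_test, y_train, y_test = train_test_split(
    texts,
    labels,
    test_size=0.1,     # 10% for testing, 90% for training
    random_state=42,   # assumption: fixed seed for a repeatable split
    stratify=labels,   # assumption: keep the class ratio in both parts
)

print(len(X_train), len(X_test))
```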

Now we have to create our model using our training data. For the model, I will use TF-IDF as the vectorizer and the Stochastic Gradient Descent algorithm as the classifier.

We found these methods and their parameters using grid search (I will not cover grid search in this article).
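A model of this kind can be sketched as an sklearn `Pipeline` that chains `TfidfVectorizer` and `SGDClassifier`. The parameters found by grid search are not listed in this post, so the sketch below uses library defaults plus a fixed `random_state`. The tiny training set is invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Invented toy training data: 1 = positive, 0 = negative.
train_texts = [
    "great phone love it",
    "wonderful film great cast",
    "excellent food friendly staff",
    "love this great product",
    "terrible phone hate it",
    "awful film boring plot",
    "bad food rude staff",
    "hate this terrible product",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Default parameters are an assumption; the tuned values are not shown here.
model = Pipeline([
    ("tfidf", TfidfVectorizer()),             # text -> TF-IDF features
    ("sgd", SGDClassifier(random_state=42)),  # linear classifier trained by SGD
])

model.fit(train_texts, train_labels)
predictions = model.predict(["great film", "terrible food"])
print(predictions)
```

Keeping the vectorizer and classifier in one pipeline means the same TF-IDF vocabulary learned from the training data is reused automatically at prediction time.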

Our model is now trained. Now let's test it with the test data and examine the accuracy, precision, recall and F1 results.
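To show how these four scores are computed, here is a small sketch using `sklearn.metrics`. The `y_true` and `y_pred` lists are made-up values, not the results of the model in this post.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up labels and predictions: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # correct among predicted positives
recall = recall_score(y_true, y_pred)        # found among actual positives
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```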


Our model scored 83% on the test data. Let's look at the confusion matrix, where we can see more clearly how accurate our predictions are.
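For reference, a confusion matrix can be built with sklearn's `confusion_matrix`. The labels and predictions below are invented for illustration and do not reproduce the 83% result above.

```python
from sklearn.metrics import confusion_matrix

# Invented labels and predictions: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

# Rows are actual classes and columns are predicted classes, in label order
# (0, 1), so the layout is [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

print(cm)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

The diagonal holds the correct predictions, and the off-diagonal cells show which kind of mistake the model makes more often.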

So now, we have successfully developed a Natural Language Processing project. This is a baseline model. 

-An Article by Rajdeep Das

Visit my LinkedIn profile

References: GitHub, DZone 
