Disaster Tweets Identification Model with TensorFlow
Natural Language Processing (NLP) is one of the key applications of neural networks, and text classification is one of its most common use cases. We can build a deep learning model from scratch to suit our problem, or we can use pre-built models that were made for the same kind of problem. TensorFlow Hub and Hugging Face are popular places to find pre-built models for NLP problems. Applying a pre-built model to our own problem is called transfer learning. Its advantage is that we can train with less data, because the model has already been trained on a large amount of similar data. It also saves the time of building the model from scratch. The field moves quickly and time is limited, so rather than reinventing the wheel, people improve existing models with methods like fine-tuning. In some cases, though, you still need to build the model from scratch. In this article, I will show you how to create a tweet classification model using TensorFlow.
Here I used a data set from Kaggle. The data set contains tweets with labels: according to its context, each tweet is labelled as a disaster tweet or a non-disaster tweet. Using this data set, I tried to build a model that can tell disaster tweets from non-disaster tweets.
As we know, the data set is the main factor in the performance of any model, so you need a clear idea about it before doing anything else. As the first step, you need to split the data set into train and test sets. This data set already comes with both, so I did not need the separation step. In this problem I already had a clear idea about the suitable model. If you do not, you need a separate validation data set for comparing candidate models, because the test data set should be kept for the final measurement of performance only. Then I noticed that the labels are in order, which is not suitable for model training.
import pandas as pd
train_df = pd.read_csv("train.csv")
If we train on an ordered data set, the model first learns patterns for one label and then overwrites those weights when it sees the next label. To avoid this issue you need to shuffle the rows of the data set so that the labels are mixed.
#shuffle training dataframe
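The code for this step did not survive in the article, so here is a minimal sketch of one common way to shuffle a dataframe with pandas. The toy dataframe stands in for `train_df`, and the column names `text` and `target` are assumptions, not taken from the article.

```python
import pandas as pd

# Toy stand-in for train_df; the column names "text" and "target" are assumptions.
toy_df = pd.DataFrame({
    "text": ["fire downtown", "flood warning", "earthquake hits",
             "nice lunch", "new phone", "great game"],
    "target": [1, 1, 1, 0, 0, 0],  # ordered labels, as described above
})

# frac=1 samples 100% of the rows without replacement, i.e. a full shuffle.
# random_state fixes the seed so the shuffled order is reproducible.
shuffled_df = toy_df.sample(frac=1, random_state=42).reset_index(drop=True)
print(shuffled_df)
```

Resetting the index is optional; it only keeps the row numbers tidy after the shuffle.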
Then I checked the data set for class imbalance. Here the balance is acceptable. In your case, you can reduce class imbalance by adding more data to the minority class (over-sampling), by generating synthetic minority examples with SMOTE, or by reducing the data in the majority class (under-sampling).
#count the examples
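A quick way to check the balance is to count the rows per label. This is a sketch on toy data; the `target` column name is an assumption.

```python
import pandas as pd

# Toy labels standing in for the real training labels (column name is an assumption).
toy_df = pd.DataFrame({"target": [0, 0, 0, 0, 1, 1, 1]})

# Absolute number of examples per class.
counts = toy_df["target"].value_counts()
# Share of each class, which is easier to compare across data sets.
ratios = toy_df["target"].value_counts(normalize=True)

print(counts)
print(ratios)
```

If one class holds only a small share of the rows, that is the signal to consider the re-sampling options mentioned above.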
If you ignore class imbalance, you can get a good accuracy value for the classification model and still see the problem in the confusion matrix, because the model cannot give good predictions for the minority class.
Every machine learning model works with numbers, so you always need to provide numbers to the model. In this case the data set contains text, so you need to convert the text into numbers before feeding it to the model. In NLP problems the two main methods for this are tokenization and embedding. Here I used text vectorization, which is a way of representing text as vectors. You need a clear understanding of the options to select a suitable text-to-number conversion for your problem.
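To see why accuracy alone can mislead, consider a hypothetical, heavily imbalanced set of labels and a classifier that always predicts the majority class. The numbers below are invented for illustration and are not results from this project.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Hypothetical imbalanced labels: 95 non-disaster (0) and 5 disaster (1) tweets.
y_true = [0] * 95 + [1] * 5
# A useless classifier that always predicts the majority class.
y_pred = [0] * 100

accuracy = accuracy_score(y_true, y_pred)
# Recall on the positive (disaster) class: how many disaster tweets were found.
minority_recall = recall_score(y_true, y_pred)
# Rows are true labels, columns are predicted labels.
matrix = confusion_matrix(y_true, y_pred)

print(f"accuracy: {accuracy:.2f}")
print(f"recall on the disaster class: {minority_recall:.2f}")
print(matrix)
```

The accuracy looks high, but the second row of the confusion matrix shows that every disaster tweet was misclassified.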
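As a rough illustration of the tokenization half, here is a pure-Python sketch that maps each word to an integer id. The helper names `build_vocab` and `tokenize` are hypothetical, and this is not the vectorization used in the article; an embedding would then map these ids (or, for a sentence encoder, the whole sentence) to dense vectors.

```python
def build_vocab(sentences):
    """Map each distinct lowercase word to an integer id; 0 is reserved for unknown words."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def tokenize(sentence, vocab):
    """Convert a sentence into a list of word ids, using 0 for unseen words."""
    return [vocab.get(word, 0) for word in sentence.lower().split()]


corpus = ["Forest fire near the town", "The town is safe"]
vocab = build_vocab(corpus)
print(tokenize("fire near the school", vocab))  # [2, 3, 4, 0]
```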
import tensorflow_hub as hub
MODEL_URL = "https://tfhub.dev/google/universal-sentence-encoder/4"
"Universal sentences encoder turns sentence into numbers"
TensorFlow Hub has many pre-built models for text-based problems. In this problem, I used the Universal Sentence Encoder (USE). It is a good model for text classification, semantic similarity, clustering and other NLP tasks. Always try to use the latest version of the selected model. This model converts each input into a 512-dimensional vector. Read the documentation of a model before applying it, so that you understand details like this.
When creating the model, I set the input data type to string and the trainable parameter to False at the start, because I was not trying to fine-tune the model at the beginning. Then I added two dense layers after the embedding layer. The first dense layer contains 64 neurons and the second contains one neuron, because binary classification needs only a single output unit. I used binary cross-entropy as the loss function and Adam as the optimizer, which is a common default for a problem like this. I used accuracy as the metric. I fit the model on the training data set and validated it on the validation data set. I trained for 5 epochs and used a callback function to create logs of the training run.
# Create a Keras layer using the USE layer from TensorFlow Hub
MODEL_URL = "https://tfhub.dev/google/universal-sentence-encoder/4"
sentence_encoder_layer = hub.KerasLayer(
    name="USE")
# Create the model using the Sequential API
)
# Compile the model
)
# Train the classifier on top of the USE layer
After those steps, I predicted on the data with the model. The results come back as floating-point values.
To measure performance we need to convert them into 0 and 1, so I rounded each value to either 0 or 1.
#convert preds to labels
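A small sketch of this conversion with NumPy, using invented probabilities. It thresholds at 0.5, which matches rounding except for a value of exactly 0.5.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n_samples, 1) like a Keras model.predict result.
pred_probs = np.array([[0.08], [0.93], [0.51], [0.37]])

# Drop the trailing axis and threshold at 0.5: above is disaster (1), otherwise 0.
pred_labels = (np.squeeze(pred_probs, axis=-1) > 0.5).astype(int)
print(pred_labels)  # [0 1 1 0]
```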
Then I calculated metrics for the model from the predicted labels and the true labels.
In addition, I recommend using TensorFlow data sets. You can pass data to the model directly if you need to, but as a best practice, try to convert your data into TensorFlow data sets before feeding it to the model.
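The article does not say which metrics were computed, so this sketch assumes the usual set for binary classification: accuracy, precision, recall and F1, here with weighted averaging. The helper name `calculate_results` and the toy labels are hypothetical.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def calculate_results(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for a set of predictions."""
    accuracy = accuracy_score(y_true, y_pred)
    # average="weighted" weights each class by its number of true examples
    # (an assumption; other averaging modes are equally valid choices).
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted"
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
results = calculate_results(y_true, y_pred)
print(results)
```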
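Assuming "TensorFlow data sets" here means the `tf.data.Dataset` API, a minimal sketch of the conversion looks like this. The sentences, labels and batch size are toy values.

```python
import tensorflow as tf

# Toy sentences and labels standing in for the real training data.
sentences = ["Forest fire near the town", "What a lovely day",
             "Flood warning issued", "New cafe opened"]
labels = [1, 0, 1, 0]

dataset = (
    tf.data.Dataset.from_tensor_slices((sentences, labels))
    .shuffle(buffer_size=len(sentences), seed=42)  # mix the examples
    .batch(2)                                      # group into batches of 2
    .prefetch(tf.data.AUTOTUNE)                    # prepare batches in the background
)

for text_batch, label_batch in dataset:
    print(text_batch.numpy(), label_batch.numpy())
```

Batching and prefetching let the input pipeline prepare the next batch while the model is still working on the current one, which is the main reason this is usually faster than passing raw arrays.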
In this article, I tried to walk you through the steps to create a tweet classification model using TensorFlow. I think this will help with similar projects. I also need to tell you not to use transfer learning for every problem, because transfer learning is useful for some problems while simple sklearn models perform well on others. You need to identify the pros and cons of every model, and in this kind of problem you need to consider the hardware requirements and time consumption. Here my main aim was higher accuracy, so I went with transfer learning. But if you need predictions in less time and can accept a small drop in accuracy, you can choose the naïve Bayes model in sklearn, because it gives results faster. This is called the speed/score tradeoff. As a developer, that is a decision you need to make according to your problem, because some problems need results quickly while others need higher accuracy.
I have always been a believer in combining knowledge with hands-on experience. You can have good knowledge and still not work well on a project due to lack of experience. So always practice these concepts and build your experience while gaining the knowledge. This will help you have a good career in this field. In the modern world you can access almost anything on the internet within a few seconds, so use that to your advantage. I also learned these things by reading articles, watching videos on YouTube, and following courses on Udemy (Daniel Bourke/Andrei Neagoie).
This article has explored ways to work with transfer learning in TensorFlow, and I hope it will assist you in completing your work more accurately. Thank you for reading my article. I hope to write more articles on new trending topics in the future, so keep an eye on my account if you liked what you read today!
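For comparison, a naïve Bayes baseline in sklearn can be sketched in a few lines. The TF-IDF features and the toy sentences are assumptions for illustration; the article does not show its baseline code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy training data, invented for illustration only.
train_sentences = [
    "forest fire spreading fast",
    "earthquake destroyed the bridge",
    "lovely sunny afternoon",
    "my new phone arrived",
]
train_labels = [1, 1, 0, 0]

# TF-IDF turns each tweet into a sparse vector of word weights,
# and multinomial naive Bayes classifies those vectors.
baseline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),
])
baseline.fit(train_sentences, train_labels)

predictions = baseline.predict(["fire in the forest"])
print(predictions)
```

A model like this trains in a fraction of a second on a CPU, which is the speed side of the tradeoff described above.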