How to Paraphrase Text using Python




1. Launching a Google Colab Notebook

We’re going to perform the text paraphrasing in the cloud using Google Colab, a hosted version of the Jupyter notebook that lets you run Python code without installing anything locally. If you’re new to Google Colab, you may want to brush up on the basics in the introductory notebook.

  1. Log into your Google account, then go to Google Colab.
  2. Launch the tutorial notebook by first heading over to File > Open Notebook and then clicking on the GitHub tab (the tab that provides a search box for repositories).
  3. Type dataprofessor/parrot into the search box.
  4. Click on the parrot.ipynb file
Screenshot of loading the PARROT tutorial notebook.

2. Installing the PARROT Library

The PARROT library can be installed via pip by typing the following into the code cell:

! pip install git+https://github.com/PrithivirajDamodaran/Parrot.git

Library installation should take a short moment.
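
As a quick sanity check after installation, a sketch like the following can confirm that the packages are importable before moving on. The helper name is_installed is hypothetical (it is not part of PARROT) and relies only on Python's standard importlib module.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# "parrot" and "torch" are the module names imported later in this tutorial.
for name in ("parrot", "torch"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either module is reported as missing, re-run the installation cell before continuing.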

Screenshot showing the installation of the PARROT Python library.

3. Importing the Libraries

Here, we’re going to import 3 Python libraries: parrot, torch and warnings. Type (or copy and paste) the following into a code cell, then run it by pressing CTRL + Enter (Windows and Linux) or CMD + Enter (macOS). Alternatively, run the code cell by clicking the play button to its left.

from parrot import Parrot
import torch
import warnings
warnings.filterwarnings("ignore")

Screenshot of the play button that allows the code cell to be run.

The parrot library contains the pre-trained text paraphrasing model that we will use to perform the paraphrasing task.

Under the hood, the pre-trained text paraphrasing model runs on PyTorch (torch), which is why we’re importing it here. This model is called parrot_paraphraser_on_T5 and is listed on the Hugging Face website. It should be noted that Hugging Face is the company that develops the transformers library and hosts the parrot_paraphraser_on_T5 model on its model hub.

As the code implies, any warnings that appear will be suppressed via the warnings library.
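
Note that warnings.filterwarnings("ignore") silences warnings for the rest of the notebook session. If you would rather hide warnings only around a specific call, the standard library's warnings.catch_warnings() context manager is one option. The sketch below uses a hypothetical noisy_call() function as a stand-in for any code that emits warnings.

```python
import warnings


def noisy_call():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn("this is only a demonstration warning", UserWarning)
    return "done"


# Suppress warnings only inside this block; the previous filters are
# restored automatically when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy_call()

print(result)
```

This keeps warnings visible everywhere else, which can help when debugging later cells.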

4. Reproducibility of the Text Paraphrasing

To make the text paraphrasing reproducible, we will set the random seed. With a fixed seed, the code produces the same results each time it is run, no matter how many times it is re-run.

To set the random seed number for reproducibility, enter the following code block into the code cell:

def random_state(seed):
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

random_state(1234)
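
To see why fixing the seed matters, here is a small illustration that uses Python's built-in random module as a stand-in for PyTorch's generator. This substitution is an assumption made so the example runs without torch, and the sample helper is hypothetical. Re-seeding with the same number reproduces the same draws.

```python
import random


def sample(seed, n=3):
    """Seed the generator, then draw n pseudo-random integers."""
    random.seed(seed)
    return [random.randint(0, 99) for _ in range(n)]


first = sample(1234)
second = sample(1234)
print(first == second)  # same seed, same sequence
```

The random_state() function above applies the same idea to PyTorch's CPU and GPU generators.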

5. Load the Text Paraphrasing Model

We will now load and initialize the PARROT model by entering the following into a code cell and running the cell.

parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5", use_gpu=False)

The models will be loaded as shown below:

Screenshot of initialized model.

6. Input Text

The input text for this example is "What’s the most delicious papayas?". It will be assigned to the phrases variable, which we will be using in just a moment.

To find out the answer to that make sure to watch the accompanying YouTube video (How to paraphrase text in Python using the PARROT library (Ft. Ken Jee)).

phrases = ["What's the most delicious papayas?"]

7. Generating the Paraphrased Text

Now for the fun part: generating the paraphrased text using the PARROT T5 model.

7.1. The Code

Enter the following code block into the code cell and run the cell.

for phrase in phrases:
    print("-"*100)
    print("Input_phrase: ", phrase)
    print("-"*100)
    para_phrases = parrot.augment(input_phrase=phrase)
    for para_phrase in para_phrases:
        print(para_phrase)

7.2. Line-by-line Explanation

Here, we’ll be using a for loop to iterate through all the sentences in the phrases variable (in the example above we assigned only a single sentence or a single phrase to this variable).

For each phrase in the phrases variable:

  • Print the - character 100 times.
  • Print "Input_phrase: " followed by the phrase currently being iterated.
  • Print the - character 100 times.
  • Perform the paraphrasing using the parrot.augment() method, which takes the phrase being iterated as its input argument. The generated paraphrases are assigned to the para_phrases variable.
  • Run a nested for loop over the para_phrases variable:
    — Print each generated paraphrase in para_phrases, one per iteration (the 4 paraphrased texts that we will see in the next section).
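
One caveat with the loop above: depending on the PARROT version, parrot.augment() may return None rather than a list when no candidate passes its quality filters, in which case iterating over the result raises a TypeError. This is an assumption about the library's behavior that you should verify against your installed version. A defensive sketch, with a hypothetical helper name format_paraphrases, could look like this:

```python
def format_paraphrases(phrase, para_phrases):
    """Build the printable lines for one input phrase.

    `para_phrases` is whatever parrot.augment() returned; it is assumed
    here that this may be None (or empty) when nothing was generated.
    """
    lines = ["-" * 100, f"Input_phrase:  {phrase}", "-" * 100]
    if not para_phrases:
        lines.append("(no paraphrases generated)")
    else:
        lines.extend(str(p) for p in para_phrases)
    return lines


# Example with a stubbed result (None) instead of a real model call.
for line in format_paraphrases("What's the most delicious papayas?", None):
    print(line)
```

In the notebook, you would pass parrot.augment(input_phrase=phrase) as the second argument in place of the stubbed None.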

7.3. Code Output

This code block generates the following output:

Here, we can see that PARROT produces 4 paraphrased texts, and you can choose any of these for further use.

8. What’s Next?

Congratulations, you have successfully produced paraphrased text using AI!

In case you’re interested in taking this a step or two further, here are some project ideas that you can try out and build to expand your own portfolio of projects. Speaking of portfolios, you can learn how to build a portfolio website for free from a recent article that I wrote.

Project Idea 1

Create a Colab/Jupyter notebook that expands on this example (which generates paraphrased text for a single input phrase) by making a version that can take in multiple phrases as input. For example, we can assign a paragraph consisting of a couple of phrases to an input variable, which is then used by the code to generate paraphrased text. Then, from the returned outputs for each phrase, randomly select a single output to represent that phrase (i.e. each input phrase will have exactly 1 corresponding paraphrased text). Combine the paraphrased phrases into a new paragraph, then compare the original paragraph with the new paraphrased paragraph.
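
As a possible starting point for this idea, here is a minimal sketch of the split, paraphrase, pick and join steps. It is not the author's solution: the names paraphrase_paragraph and fake_paraphraser are hypothetical, the sentence splitting is deliberately naive, and a stub stands in for the model so the example runs without PARROT installed.

```python
import random
import re


def paraphrase_paragraph(paragraph, paraphrase_fn, seed=1234):
    """Paraphrase a paragraph sentence by sentence.

    paraphrase_fn is a hypothetical callable that takes one sentence and
    returns a list of candidate strings (or None/empty if it has none).
    """
    rng = random.Random(seed)  # local generator for a repeatable choice
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    chosen = []
    for sentence in sentences:
        # Fall back to the original sentence when no candidates come back.
        candidates = paraphrase_fn(sentence) or [sentence]
        chosen.append(rng.choice(candidates))
    return " ".join(chosen)


def fake_paraphraser(sentence):
    """Stub used for illustration; replace with a real model call."""
    return [sentence.lower(), sentence.upper()]


original = "Papayas are sweet. Which one is best?"
print(original)
print(paraphrase_paragraph(original, fake_paraphraser))
```

To use the real model, you could pass a small wrapper around parrot.augment() as paraphrase_fn. If your PARROT version returns (text, score) tuples, which is an assumption worth checking, the wrapper would need to extract the text from each tuple first.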

Project Idea 2

Expand on Project Idea 1 by making it into a web app using Streamlit (also check out the Streamlit Tutorial Playlist) or PyWebIO. In particular, the web app would take a paragraph of phrases as input, apply the code to generate paraphrased text, and return the result as output in the main panel of the web app.

Share Your Creations

I’d love to see some of your creations and so please feel free to post them in the comment section. Happy creation!
