
🧑‍💻 How to use GPT for writing and coding

Earlier we spoke about how you can use text models. Now we're going to dive into the nitty-gritty of GPT and some of the things it can do with language and code.

Language Capabilities

Write

Large language models excel at generating written text and can be used for a variety of tasks such as writing blog posts, emails, advertisements, website copy, product descriptions, memos, storytelling, brainstorming, question generation and more.

As an example, for an instruction-following model, a prompt could be: "Write a persuasive product description for a new brand of eco-friendly laundry detergent."

Explain & edit

Large language models can also pull information out of an existing piece of text by answering questions about it, summarizing it, classifying it, or extracting entities from it.

For example, you can ask a model to answer questions about an unfamiliar text, summarize long documents or detailed meeting notes, classify customer feedback messages by topic or sentiment, or extract contact information, names, or other entities from a document.

When a text exceeds a model's token limit (about 4,000 tokens for text-davinci-003 and 2,000 tokens for earlier models), divide it into smaller segments, rank the segments by relevance to your question, and send only the most relevant ones to the model.
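The split, rank, and query steps can be sketched in a few lines. This is only an illustration under some loud assumptions: segment size is measured in words rather than real tokens, relevance is a simple word-overlap count standing in for a proper ranking method such as embeddings, and the function names and sample text are made up for the example.

```python
import re

def split_into_chunks(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words.
    Assumption: word count is only a rough stand-in for token count."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def overlap_score(chunk, question):
    """Count distinct question words that also appear in the chunk.
    Assumption: a toy relevance measure, not what a real system would use."""
    def tokenize(s):
        return set(re.findall(r"[a-z0-9]+", s.lower()))
    return len(tokenize(chunk) & tokenize(question))

def most_relevant_chunks(text, question, max_words=50, top_k=2):
    """Return the top_k chunks, most relevant first."""
    chunks = split_into_chunks(text, max_words)
    ranked = sorted(chunks, key=lambda c: overlap_score(c, question), reverse=True)
    return ranked[:top_k]

# Made-up sample document and question.
document = (
    "The billing team meets on Mondays. "
    "Refunds are processed within five business days. "
    "The office kitchen is restocked every Friday."
)
question = "How long do refunds take to be processed?"

top = most_relevant_chunks(document, question, max_words=6, top_k=1)

# Only the selected segments go into the prompt sent to the model.
prompt = "Context:\n" + "\n".join(top) + "\n\nQuestion: " + question + "\nAnswer:"
print(prompt)
```

Here the chunk mentioning refunds wins because it shares the words "refunds" and "processed" with the question. In practice you would tune the chunk size to the model's limit and leave room for the question and the answer.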

You can even edit a text using the edit endpoint of the OpenAI API.

Code Capabilities

Write code

An example prompt for writing code with code-davinci-002:

```
SQL tables (and columns):
* Customers(customer_id, signup_date)
* Streaming(customer_id, video_id, watch_date, watch_minutes)
A well-written SQL query that lists customers who signed up during March 2020 and watched more than 50 hours of video in their first 30 days:
```

Output:

SELECT c.customer_id
FROM Customers c
JOIN Streaming s
ON c.customer_id = s.customer_id
WHERE c.signup_date BETWEEN '2020-03-01' AND '2020-03-31'
AND s.watch_date BETWEEN c.signup_date AND DATE_ADD(c.signup_date, INTERVAL 30 DAY)
GROUP BY c.customer_id
HAVING SUM(s.watch_minutes) > 50 * 60

code-davinci-002 is able to make inferences from variable names; for example, it infers that watch_minutes has units of minutes and therefore needs to be converted by a factor of 60 before being compared with 50 hours.
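To see the minutes-versus-hours comparison in action, here is a small sketch that runs the generated query against an in-memory SQLite database. Everything about the setup is an assumption for illustration: the rows are made up, and `DATE_ADD(..., INTERVAL 30 DAY)` is swapped for SQLite's `date(..., '+30 days')` because SQLite does not have `DATE_ADD`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers(customer_id INTEGER, signup_date TEXT);
CREATE TABLE Streaming(customer_id INTEGER, video_id INTEGER,
                       watch_date TEXT, watch_minutes INTEGER);

-- Made-up sample rows.
INSERT INTO Customers VALUES (1, '2020-03-05'), (2, '2020-03-10'), (3, '2020-04-01');
INSERT INTO Streaming VALUES
  (1, 10, '2020-03-06', 2000),
  (1, 11, '2020-03-20', 1200),
  (2, 10, '2020-03-11', 2900),
  (2, 12, '2020-05-01', 500),
  (3, 10, '2020-04-02', 4000);
""")

# Same query as above, with the date arithmetic adapted to SQLite.
query = """
SELECT c.customer_id
FROM Customers c
JOIN Streaming s ON c.customer_id = s.customer_id
WHERE c.signup_date BETWEEN '2020-03-01' AND '2020-03-31'
  AND s.watch_date BETWEEN c.signup_date AND date(c.signup_date, '+30 days')
GROUP BY c.customer_id
HAVING SUM(s.watch_minutes) > 50 * 60
"""

rows = [row[0] for row in conn.execute(query)]
print(rows)  # [1]
```

Customer 1 watched 3,200 minutes inside the 30-day window, which clears the 3,000-minute (50-hour) threshold. Customer 2 only reaches 2,900 minutes inside the window, and customer 3 signed up in April.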

Explain Code

Code explanation can be applied to many use cases:

  • Generating in-code documentation (e.g., Python docstrings, git commit messages)
  • Generating out-of-code documentation (e.g., man pages)
  • In an interactive code exploration tool
  • Communicating program results back to users via a natural language interface

An example prompt for explaining code with code-davinci-002:

A SQL query:
```sql
SELECT c.customer_id
FROM Customers c
JOIN Streaming s
ON c.customer_id = s.customer_id
WHERE c.signup_date BETWEEN '2020-03-01' AND '2020-03-31'
AND s.watch_date BETWEEN c.signup_date AND DATE_ADD(c.signup_date, INTERVAL 30 DAY)
GROUP BY c.customer_id
HAVING SUM(s.watch_minutes) > 50 * 60
```
Questions:
1. What does the SQL query do?
2. Why might someone be interested in this time period?
3. Why might a company be interested in this SQL query?
Answers:

Output:

1. The SQL query finds all customers who signed up in March 2020 and watched more than 50 hours of content in the first 30 days after signing up.
2. The time period is interesting because it is the first month of the COVID-19 pandemic.
3. A company might be interested in this SQL query because it can help them understand how the pandemic has affected their business.

Edit Code

OpenAI's edit endpoint is particularly useful for editing code.

Example text input to code-davinci-edit-001:

def tribonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n == 2:
        return 1
    elif n == 3:
        return 2
    else:
        return tribonacci(n-1) + tribonacci(n-2) + tribonacci(n-3)

Example instruction inputs:

Add a docstring
Add typing
Improve the runtime
Add a test
Translate to JavaScript (or Rust or Lisp or any language you like)
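To picture how the code and an instruction travel together, here is a sketch that builds one request body per instruction. It assumes the edit endpoint accepts `model`, `input`, and `instruction` fields. The helper name `build_edit_request` is hypothetical, and nothing is sent over the network.

```python
import json

# The function to edit, passed to the model as plain text.
SOURCE = '''def tribonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n == 2:
        return 1
    elif n == 3:
        return 2
    else:
        return tribonacci(n-1) + tribonacci(n-2) + tribonacci(n-3)
'''

def build_edit_request(code, instruction, model="code-davinci-edit-001"):
    """Return a dict shaped like an edit request body.
    Assumption: the field names are model, input, and instruction."""
    return {"model": model, "input": code, "instruction": instruction}

instructions = [
    "Add a docstring",
    "Add typing",
    "Improve the runtime",
    "Add a test",
    "Translate to JavaScript",
]

# One request body per instruction, all sharing the same input code.
edit_requests = [build_edit_request(SOURCE, text) for text in instructions]
print(json.dumps(edit_requests[0], indent=2))
```

The same input is reused for every instruction, so you can try several edits on the same starting code.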

Example output after improving the runtime and translating to JavaScript:

function tribonacci(n) {
  let a = 0;
  let b = 1;
  let c = 1;
  for (let i = 0; i < n; i++) {
    [a, b, c] = [b, c, a + b + c];
  }
  return a;
}

As you can see, code-davinci-edit-001 reduced the function's runtime from exponential to linear and converted it from Python to JavaScript.
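If you want to convince yourself that the rewritten loop computes the same values, here is a sketch that ports the loop back to Python and checks it against the original recursive version. The two function names are made up for this comparison.

```python
def tribonacci_recursive(n):
    """The original version: three recursive calls, exponential runtime."""
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n == 2:
        return 1
    elif n == 3:
        return 2
    else:
        return (tribonacci_recursive(n - 1)
                + tribonacci_recursive(n - 2)
                + tribonacci_recursive(n - 3))

def tribonacci_iterative(n):
    """Python port of the edited version: one loop, linear runtime."""
    a, b, c = 0, 1, 1
    for _ in range(n):
        # Slide the three-value window forward by one position.
        a, b, c = b, c, a + b + c
    return a

# The two versions agree on every input tried here.
for n in range(15):
    assert tribonacci_recursive(n) == tribonacci_iterative(n)

print([tribonacci_iterative(n) for n in range(10)])
# [0, 1, 1, 2, 4, 7, 13, 24, 44, 81]
```

The loop keeps only the last three values, so each step is a constant amount of work instead of three more recursive calls.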

Compare Code

The OpenAI API also features code search embeddings, which can measure the relevance of a section of code to a text query, or the similarity between two sections of code.

OpenAI code search embeddings significantly improved the state of the art on the CodeSearchNet evaluation suite, scoring 93.5% versus the previous record of 77.4%.

Read more about OpenAI's code embeddings in the blog post announcement or the documentation.

Code embeddings can be useful for use cases such as:

  • Code search
  • Codebase clustering & analysis
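As a rough picture of how code search with embeddings works, here is a sketch that ranks a few snippets against a query by cosine similarity. The vectors are made-up three-dimensional stand-ins, not real embeddings (real ones come from the API and have many more dimensions). Cosine similarity is assumed here as the comparison measure.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: closer to 1 means more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up vectors standing in for the embeddings of three code snippets.
snippet_embeddings = {
    "def add(a, b): return a + b": [0.9, 0.1, 0.0],
    "def read_file(path): return open(path).read()": [0.1, 0.9, 0.2],
    "def tribonacci(n): ...": [0.7, 0.0, 0.6],
}

# Made-up vector standing in for the embedding of the query
# "load text from disk".
query_embedding = [0.2, 0.8, 0.1]

# Rank snippets from most to least similar to the query.
ranked = sorted(
    snippet_embeddings,
    key=lambda snippet: cosine_similarity(query_embedding, snippet_embeddings[snippet]),
    reverse=True,
)

for snippet in ranked:
    score = cosine_similarity(query_embedding, snippet_embeddings[snippet])
    print(f"{score:.2f}  {snippet}")
```

With these toy vectors the file-reading snippet ranks first. The same similarity function applied to pairs of snippets gives you the distances needed for clustering a codebase.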