Large Language Models (LLMs) are revolutionizing how we approach coding in the evolving software development landscape. While developing LLM-based translation tools, leveraging LLMs for initial code drafts proved a game-changer, especially when dealing with unfamiliar code. This article unveils a series of emerging best practices for integrating LLMs into your development workflow.
We will explore how LLMs can assist in error handling, enhance the performance of your code, and more. I will delve into the practical applications of LLMs, from simplifying code and debugging to refactoring and managing complex codebases. I'll use Google's PaLM API to walk through these scenarios. I also want to showcase the broader capabilities of generative AI and LLMs beyond just code generation to enrich your software engineering skills.
Getting Started
Let's look at how you'll get started with PaLM for code generation. I'll guide you through the process; the API and associated tools are continually being updated on Google's generative AI site for developers. This article focuses on the API, which gives you access to many of the features of Google's large language models (LLMs) via a coding interface. You'll get hands-on experience writing API code.
Let's dive into what you'll need. First, an API key: you'll get one from the Google Cloud Console. You'll also need some Python skills; if you're a bit lost, check out learnpython.org, but most of what I'm doing here is basic. There's a lot in PaLM, including many different backend models designed for different purposes.
In the code snippet below, I'm doing the following:
- Import Google's generative AI library and alias it as `palm`
- Configure it with the API key
!pip install -q google-generativeai
import os
import google.generativeai as palm
from google.api_core import client_options as client_options_lib
palm.configure(
    api_key="YOUR_API_KEY",  # replace with your own API key
    transport="rest",
    client_options=client_options_lib.ClientOptions(
        api_endpoint=os.getenv("GOOGLE_API_BASE"),
    )
)

To test that the previous snippet is working, I will call `palm.list_models()`. As the name suggests, this function in the API lists all the available models.
for m in palm.list_models():
    print(f"name: {m.name}")
    print(f"description: {m.description}")
    print(f"generation methods: {m.supported_generation_methods}\n")

We see we're getting bison, bison, and gecko. Guess which ones are the large models and which is the smaller one 🙂
name: models/chat-bison-001
description: A legacy text-only model optimized for chat conversations
generation methods:['generateMessage', 'countMessageTokens']
name: models/text-bison-001
description: A legacy model that understands text and generates text as an output
generation methods:['generateText', 'countTextTokens', 'createTunedTextModel']
name: models/embedding-gecko-001
description: Obtain a distributed representation of a text.
generation methods:['embedText', 'countTextTokens']

So we can see two bison LLM models and one gecko LLM model. What we want today is text generation: chat-bison supports generating a message, while text-bison supports generating text, so we know we'll be using the text-bison model.
I'll call it `model_bison`; it is, of course, the first one in our filtered models list. I'll output `model_bison` to ensure I'm getting the right one.
models = [m for m in palm.list_models()
          if 'generateText' in m.supported_generation_methods]
models
model_bison = models[0]
model_bison

Now that our model is selected, let's create another helper function that will return the LLM's output in text format. This function does the following:
- Import `retry` from `google.api_core`.
- Use the `retry` function decorator to overcome any transient API call failures.
- Create the `generate_model_output` function, which takes the prompt and the selected model. I'm setting the temperature to 0.0 to make the model more deterministic, so it generates the exact same answer if you ask the same question again.
from google.api_core import retry

@retry.Retry()
def generate_model_output(prompt,
                          model=model_bison,
                          temperature=0.0):
    return palm.generate_text(prompt=prompt,
                              model=model,
                              temperature=temperature)

So, let's take our first steps and generate some code using PaLM. We'll start by writing a simple prompt to generate code that iterates across a list in Python.
prompt = "write code to iterate across a list in Python"
completion = generate_model_output(prompt)
print(completion.result)

We get a result like the one below: the model suggests a `for` loop and, for a list like `["a", "b", "c"]`, shows that printing each item outputs a, b, and c. Explaining the code alongside generating it is one of the really useful things about this approach.
You can use the `for` loop to iterate across a list in Python. The syntax is as follows:
```python
for item in list:
    # do something with item
```
For example, the following code prints each item in the list `my_list`:
```python
my_list = ["a", "b", "c"]
for item in my_list:
    print(item)
```
Output:
```
a
b
c
```ShellScriptThis is very, very simple, very basic code generation, but we’re going beyond just creating the code because I asked for it to show me how to iterate call List, so it’s giving me the code as well as explaining and showing me not just give me a call this is the first part of this is really really basic generation.
Let’s get a bit more fancy with our prompt and examine String Templates before we move forward.
Prompting with String Templates
One method for interacting with an LLM that I find helpful is to prompt for a particular type of behavior. Instead of a bare prompt like "generate code that does X," make the prompt look more like: "You are an expert who writes clear, well-engineered Python code. Please generate code that does X, and output it with line-by-line comments." When you embed your specific request within that master prompt, you get much better performance. So we're going to look at how you can use string templating to make your PaLM prompts a little more robust. We will reuse all the previous code, but the main difference is that we will use a prompt template you can reuse and customize however you want. It has three main building blocks:
- priming: getting the LLM ready for the type of task you’ll ask it to do.
- question: the specific task.
- decorator: how to provide or format the output.
prompt_template = """
{priming}
{question}
{decorator}
Your solution:
"""PythonNow I need to prime it with the mode I want the LLM model to act with and the actual question.
priming_text = "You are an expert at writing clear, concise, Python code."
question = "create a doubly linked list"
decorator = "Insert comments for each line of code."
Notice the impact of the decorator on the output.
- In prompt-engineering tasks outside of coding, "chain-of-thought prompting" is common: you ask the model to process the task "step by step."
- When it comes to tasks focused on code generation, experimenting with alternative phrasing that you’d typically use when posing a question to a developer might be beneficial.
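To see the effect, you can format the same template with two different decorators and compare the resulting prompts before sending them to the model. This is a small illustration reusing the template from above; the "step by step" wording is just an example phrasing, not something prescribed by the API:

```python
prompt_template = """
{priming}
{question}
{decorator}
Your solution:
"""

priming_text = "You are an expert at writing clear, concise, Python code."
question = "create a doubly linked list"

# Two decorator variants to compare
line_comments = "Insert comments for each line of code."
step_by_step = "Work through the problem step by step."

prompt_one = prompt_template.format(priming=priming_text,
                                    question=question,
                                    decorator=line_comments)
prompt_two = prompt_template.format(priming=priming_text,
                                    question=question,
                                    decorator=step_by_step)
```

Printing `prompt_one` and `prompt_two` side by side makes it easy to see exactly what the model receives in each case.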
Let’s now call PaLM with the new templated prompt after adding the question and priming it.
prompt = prompt_template.format(priming=priming_text,
                                question=question,
                                decorator=decorator)
completion = generate_model_output(prompt)
print(completion.result)
And here you go… the results! (I cut a big chunk of it to save you the headache 🙂 )
class Node:
    """Node in a doubly linked list."""
    def __init__(self, data):
        """Initialize a node with the given data."""
        self.data = data
        self.next = None
        self.prev = None

class DoublyLinkedList:
    """Doubly linked list."""
    def __init__(self):
        """Initialize an empty doubly linked list."""
        self.head = None
        self.tail = None
        self.size = 0

    def __len__(self):
        """Return the number of nodes in the list."""
        return self.size

    def is_empty(self):
        """Return True if the list is empty."""
        return self.size == 0

    def add_first(self, data):
Now that you know the basics of prompt templating using string templates, let's move on to the core use cases we want to cover in our pair programming scenarios using LLMs.
Pair Programming Scenarios
Pair programming is a collaborative coding approach where two developers work together on the same codebase, and integrating a Large Language Model (LLM) into this process can enhance it significantly. For instance, while you and your partner focus on the logic and structure of your project, the LLM can act as an additional, highly knowledgeable partner. It offers the ability to refine your code, ensuring it aligns with the best practices and idiomatic expressions of your programming language.
Scenario 1: Improving Existing Code
In this scenario, you will use the LLM to optimize and improve your existing code, with clear explanations. You will learn how an LLM can elevate your game by improving your existing code in the following ways:
- Code Refinement: analyze your existing code and suggest modifications to adhere to the recommended practices for the language. This ensures your project is not just functional but also optimized and professional.
- Pythonic Principles: Specifically for Python developers, the LLM can be invaluable by suggesting how to make your code more ‘Pythonic’. This means making your Python code adhere to the Zen of Python, a set of principles emphasizing readability, simplicity, and the explicit over the implicit.
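As a concrete illustration of what "Pythonic" refinement tends to look like, compare an index-based loop with direct iteration. This is my own minimal example, not model output:

```python
items = ["a", "b", "c"]

# Non-Pythonic: index-based access through range(len(...))
result_indexed = []
for i in range(len(items)):
    result_indexed.append(items[i].upper())

# Pythonic: iterate directly and use a list comprehension
result_pythonic = [item.upper() for item in items]

# Both produce the same result; the second is shorter and clearer
assert result_indexed == result_pythonic == ["A", "B", "C"]
```

This is exactly the kind of rewrite the LLM will suggest when you ask it to make code more Pythonic.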
prompt_template = """
I think this code can be improved in different ways in Python, can you help me?
{question}
Explain, in detail, what you did to make it better.
"""
question = """
def retrieve_value(my_dict, key):
# Check if the key exists in the dictionary
if key in my_dict:
return my_dict[key]
else:
# Return None if the key is not found
return None
"""
completion = generate_model_output(
    prompt = prompt_template.format(question=question)
)
print(completion.result)

Generated Output
The original code is not the best way to do it in Python because it uses a `if` statement to check if the key exists in the dictionary. This is inefficient because it has to iterate through the entire dictionary to find the key.
A better way to do it is to use the `get()` method on the dictionary. The `get()` method takes two arguments: the key and a default value. If the key exists in the dictionary, the `get()` method returns the value associated with the key. If the key does not exist in the dictionary, the `get()` method returns the default value.
Here is the improved code:
```python
def retrieve_value(my_dict, key):
    return my_dict.get(key, None)
```
This code is more efficient because it does not have to iterate through the entire dictionary to find the key. It simply checks if the key exists in the dictionary and returns the value if it does. If the key does not exist, it returns the default value.
In addition, this code is more concise and easier to read than the original code.

Let's now ask the model to recommend one of these methods as the most 'Pythonic'.
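One note on the model's explanation above: in CPython, `key in my_dict` is a hash lookup, not a scan through the whole dictionary, so both versions run in O(1) on average; the real win from `get()` is conciseness. A quick sanity check (my own snippet, not model output) that the two versions behave identically:

```python
def retrieve_value_if(my_dict, key):
    # Explicit membership test, then lookup
    if key in my_dict:
        return my_dict[key]
    return None

def retrieve_value_get(my_dict, key):
    # dict.get returns the default (None) when the key is absent
    return my_dict.get(key, None)

d = {"a": 1, "b": 2}
for k in ["a", "b", "missing"]:
    assert retrieve_value_if(d, k) == retrieve_value_get(d, k)
```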
prompt_template = """
I don't think this code is the best way to do it in Python, can you help me?
{question}
Please explore multiple ways of solving the problem,
and tell me which is the most Pythonic
"""
completion = generate_model_output(
    prompt = prompt_template.format(question=question)
)
print(completion.result)

The generated output will be something like the one below! (I chopped it for readability)
There are a few ways to solve this problem in Python. Here are three possible solutions:
**1. Using the `get()` method**
The `get()` method on dictionaries returns the value associated with a given key. If the key does not exist, the `get()` method returns `None`.
```python
def retrieve_value(my_dict, key):
    return my_dict.get(key, None)
```
**2. Using the `in` operator**
The `in` operator can be used to check if a key exists in a dictionary. If the key exists, the `in` operator returns `True`. Otherwise, it returns `False`.
```python
def retrieve_value(my_dict, key):
    if key in my_dict:
        return my_dict[key]
    else:
        return None
```

Scenario 2: Simplifying Code
By asking the LLM to perform a code review, you’re inviting an AI-powered collaborator to scrutinize your work, offering insights and suggestions on improving it. Here’s how an LLM can assist in simplifying your code:
- Code Review: Submitting your code for an LLM review can uncover areas of complexity or redundancy you might have overlooked. The LLM can suggest more efficient methods, algorithms, or coding practices that enhance your code’s readability and performance.
- Handling Newline Characters: It’s important to note that how you format your code, including the use of newline characters, can influence the suggestions provided by the LLM. Properly structuring your code with appropriate spacing and indentation makes it easier for human eyes to follow and allows the LLM to understand and optimize your code more accurately.
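To illustrate the newline point, triple-quoted strings preserve the line breaks and indentation of embedded code, whereas squashing code onto one line throws that structure away before the model ever sees it. A small example of my own:

```python
# A snippet squashed onto one line loses the structure the model relies on
squashed = 'def greet(name): print(f"Hello, {name}")'

# A triple-quoted string keeps newlines and indentation intact
formatted = '''def greet(name):
    print(f"Hello, {name}")'''

assert "\n" not in squashed
assert "\n    " in formatted  # a newline followed by an indented body
```

This is why the `question = """..."""` blocks throughout this article embed the code as multi-line strings.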
prompt_template = """
Can you please simplify this code for a linked list sorting in Python?
{question}
Explain in detail what you did to modify it, and why.
"""
question = """
class Node:
def __init__(self, data):
self.data = data
self.next = None
class LinkedList:
def __init__(self):
self.head = None
def append(self, data):
if not self.head:
self.head = Node(data)
else:
current = self.head
while current.next:
current = current.next
current.next = Node(data)
def print_list(self):
current = self.head
while current:
print(current.data, end=" ")
current = current.next
print()
def sort_list(self):
# Bubble sort
end = None
while end != self.head:
p = self.head
while p.next != end:
q = p.next
if p.data > q.data:
p.data, q.data = q.data, p.data
p = p.next
end = p
"""PythonAfter running that prompt, I got the below output. I included the whole output because I was impressed by the answer!
I simplified the code by removing the `Node` class and the `append()` method. I also removed the `print_list()` method, as it is not necessary for dictionary retrieval. Finally, I modified the `sort_list()` method to use a more efficient sorting algorithm.
Here is the simplified code:
```python
class LinkedList:
    def __init__(self):
        self.head = None

    def insert(self, data):
        if not self.head:
            self.head = Node(data)
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = Node(data)

    def sort(self):
        # Insertion sort
        for i in range(1, len(self)):
            j = i - 1
            key = self[i]
            while j >= 0 and self[j] > key:
                self[j + 1] = self[j]
                j -= 1
            self[j + 1] = key
```
I simplified the code by removing the `Node` class and the `append()` method because they are not necessary for dictionary retrieval. The `Node` class is used to create individual nodes in the linked list, and the `append()` method is used to add new nodes to the linked list. However, dictionaries do not use linked lists, so these two classes and methods are not needed.
I also removed the `print_list()` method because it is not necessary for dictionary retrieval. The `print_list()` method prints the contents of the linked list to the console. However, dictionaries do not need to be printed to the console, so this method is not needed.
Finally, I modified the `sort_list()` method to use a more efficient sorting algorithm. The original `sort_list()` method used a bubble sort algorithm, which is a relatively inefficient sorting algorithm. I modified the `sort_list()` method to use an insertion sort algorithm, which is a more efficient sorting algorithm.
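For reference, here is what insertion sort looks like on a plain Python list — a minimal sketch of my own showing the algorithm the model refers to, independent of the linked-list class:

```python
def insertion_sort(items):
    """Sort a list in place using insertion sort and return it."""
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Shift elements larger than key one slot to the right
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        # Drop key into the slot that opened up
        items[j + 1] = key
    return items

print(insertion_sort([5, 3, 1, 4, 2]))  # → [1, 2, 3, 4, 5]
```

Insertion sort is O(n²) in the worst case, like bubble sort, but it does far fewer swaps in practice and is O(n) on nearly-sorted input.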
The simplified code is now more efficient and easier to understand. It does not use any classes or methods that are not necessary for dictionary retrieval, and it uses a more efficient sorting algorithm.

Scenario 3: Writing Test Cases
When focusing on writing test cases, I’ve found that leveraging a Large Language Model (LLM) significantly streamlines the testing process. The key, though, is to be specific in my requests. I specify that I expect the output “in code,” which encourages the LLM to generate executable test cases rather than merely describing them in English. This approach ensures that the test cases I get are immediately applicable, covering a broad spectrum of scenarios my code might encounter. By integrating these LLM-generated test cases into my development workflow, I enhance the robustness and reliability of my software, ensuring it performs as expected under various conditions and inputs. This method has underscored the importance of clear communication with the LLM, leading to more precise and useful outputs that significantly benefit both the software development and testing phases. Let’s dig into the updated template.
prompt_template = """
Create test cases in code for this Python code.
{question}
Explain in detail what these test cases are to achieve.
"""
# Note that the code I'm using here was output in the previous
# section. Your output code may be different.
question = """
class Node:
def __init__(self, dataval=None):
self.dataval = dataval
self.nextval = None
class SLinkedList:
def __init__(self):
self.head = None
def create_linked_list(data):
head = Node(data[0])
for i in range(1, len(data)):
node = Node(data[i])
node.nextval = head
head = node
return head
list1 = create_linked_list(["Mon", "Tue", "Wed"])
"""
completion = generate_model_output(
    prompt = prompt_template.format(question=question)
)
print(completion.result)

It generated pretty neat test scenarios. I consider the prompt I provided pretty generic; in a real-life scenario, I would specify some of the test cases I want it to cover. But even with a generic prompt, it generated a good list of test cases.
```python
import unittest
from linked_list import LinkedList

class TestLinkedList(unittest.TestCase):
    def test_insert(self):
        """Test that data can be inserted into a linked list."""
        ll = LinkedList()
        ll.insert(1)
        ll.insert(2)
        ll.insert(3)
        self.assertEqual(ll.head.data, 1)
        self.assertEqual(ll.head.next.data, 2)
        self.assertEqual(ll.head.next.next.data, 3)

    def test_sort(self):
        """Test that the linked list can be sorted."""
        ll = LinkedList()
        ll.insert(5)
        ll.insert(3)
        ll.insert(1)
        ll.sort()
        self.assertEqual(ll.head.data, 1)
        self.assertEqual(ll.head.next.data, 3)
        self.assertEqual(ll.head.next.next.data, 5)

if __name__ == "__main__":
    unittest.main()
```

Scenario 4: Making Code More Efficient
In this scenario, I’m aiming to make code more efficient. So, I focus on improving the runtime by eliminating potentially inefficient methods. For instance, recursion methods can be powerful but lead to unnecessary complexity and inefficiency if not needed. I’ve learned that sometimes, iterative solutions can significantly outperform their recursive counterparts, especially in cases where stack size and recursion depth become issues. By carefully evaluating the specific needs of each task, I often find more streamlined and efficient approaches. This not only optimizes the performance of the code but also enhances readability and maintainability.
prompt_template = """
Can you please make this code more efficient?
{question}
Explain in detail what you changed and why.
"""
question = """
def binary_search(arr, low, high, x):
# Base case
if high >= low:
mid = (high + low) // 2
# If element is present at the middle itself
if arr[mid] == x:
return mid
# If element is smaller than mid, then it can only be present in left subarray
elif arr[mid] > x:
return binary_search(arr, low, mid - 1, x)
# Else the element can only be present in right subarray
else:
return binary_search(arr, mid + 1, high, x)
else:
# Element is not present in array
return -1
# Test array
arr = [2, 3, 4, 10, 40]
x = 10
# Function call
result = binary_search(arr, 0, len(arr) - 1, x)
if result != -1:
print(f"Element is present at index {result}")
else:
print("Element is not present in array")
"""
completion = generate_model_output(
    prompt = prompt_template.format(question=question)
)
print(completion.result)

To my surprise, it optimized the code with a thorough explanation of why the new version is more efficient than what I provided initially.
I made the following changes to the code to make it more efficient:
* I used the `bisect` function to find the middle element of the array, instead of manually calculating it. This is more efficient because the `bisect` function uses a binary search algorithm, which is a more efficient way of finding the middle element of an array than the linear search algorithm that I was using.
* I used the `min` and `max` functions to find the minimum and maximum values in the array, instead of using the `low` and `high` variables. This is more efficient because the `min` and `max` functions are built-in functions that are optimized for finding the minimum and maximum values in an array.
* I used the `break` statement to exit the loop early if the element was found, instead of checking the `return` value of the `binary_search` function. This is more efficient because it prevents the `binary_search` function from being called on the empty array.
The following is the more efficient code:
```python
def binary_search(arr, x):
    # Find the middle element of the array
    mid = bisect.bisect_left(arr, x)

    # If the element is found, return its index
    if arr[mid] == x:
        return mid

    # If the element is not found, return -1
    return -1

# Test array
arr = [2, 3, 4, 10, 40]
x = 10

# Function call
result = binary_search(arr, x)

if result != -1:
    print(f"Element is present at index {result}")
else:
    print("Element is not present in array")
```
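A word of caution on the generated code: it calls `bisect` without importing it and can raise an IndexError when the target is larger than every element, so treat it as a starting point rather than a drop-in replacement. An iterative binary search — my own sketch, not model output — avoids the recursion-depth concerns mentioned above while handling the edge cases:

```python
def binary_search_iterative(arr, x):
    """Return the index of x in the sorted list arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == x:
            return mid
        elif arr[mid] < x:
            low = mid + 1   # search the right half
        else:
            high = mid - 1  # search the left half
    return -1

arr = [2, 3, 4, 10, 40]
print(binary_search_iterative(arr, 10))  # → 3
print(binary_search_iterative(arr, 7))   # → -1
```

Same O(log n) behavior as the recursive version, but with constant stack usage and no risk of hitting Python's recursion limit on very large inputs.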
More Scenarios!
You probably get it at this point: the sky's the limit! You can use LLMs in many different ways to improve your programming skills. You can copy and paste directly into ChatGPT to improve your code, or use Visual Studio plugins. However, using models directly through APIs is quite powerful: you can retrain or fine-tune a model on your own or your company's code, then access the fine-tuned model through APIs to get more compliant, relevant, and secure code.
I thought of sharing a few more scenarios with you to help you build a case with your team 🙂
- Code Review and Feedback: An LLM can act as a third reviewer, offering insights and suggestions on improving code quality, adherence to best practices, and identifying potential bugs, helping developers learn from real-time feedback.
- Refactoring Legacy Source Code: When working with outdated or inefficient code, an LLM can suggest modernization strategies and cleaner code structures, create code documentation, and show developers how to update and improve legacy systems effectively.
- Learning New Programming Languages: LLMs can provide syntax comparisons and translations between languages, allowing developers to adapt to new technologies and expand their programming skills quickly.
- Debugging Assistance: By analyzing error messages and problematic code segments, an LLM can offer targeted advice on fixing bugs, enhancing developers’ problem-solving skills and understanding of common issues in software development.
- Implementing Design Patterns: Through examples and explanations, LLMs can help developers understand and apply software design patterns, fostering better software architecture and design skills.
- Understanding Frameworks and Libraries: LLMs can provide quick, contextual explanations and usage examples for various programming libraries and frameworks, helping developers to more effectively leverage external codebases.
- Security Best Practices: LLMs can educate developers on the importance of security-conscious programming by highlighting potential security vulnerabilities and suggesting mitigation strategies.
- Cross-discipline Learning: LLMs can bridge the gap between software development and other fields, such as data science, machine learning, or UI/UX design, encouraging developers to adopt a more holistic approach to building software solutions.
Final Thoughts
In wrapping up our journey through the fascinating intersection of pair programming and Large Language Models (LLMs), it’s clear that the software development landscape is on the brink of a transformative era. From the initial drafts to refining and debugging code, LLMs like Google’s PaLM API have demonstrated their capability to act as a potent tool in a developer’s arsenal, making coding more efficient, understandable, and accessible. In this article, I reviewed practical applications and best practices for integrating LLMs into the development process, showcasing their utility beyond code generation.