Adventures in ChatGPT 4: Side Quest – How to be a Prompt Wizard

Ten points to Hufflepuff for Dall-E this week. It obviously had the same idea as me on how to be a prompt wizard… buy a watch.

And by “this week” – I mean four months. The irony of joking about a prompt wizard whilst being four months late isn’t lost on me.

Anyway, as promised and trailed by the title, we’re making a little swerve this week from being driven by ChatGPT through a project. Since I noticed that I seem to get answers of wildly different quality depending on my questions, I thought I would endeavour to learn how to become a “prompt wizard”. So I did a bit of research and, like a good tester, a bit of an experiment. And I learned a few handy hints which I’ll share. I think “wizardly” might be a bit too generous for what I can do now, but I can at least manage a little bit of prompt magic.

I’ll summarise at the end, so you don’t have to keep notes, but the first thing I noticed last year is that ChatGPT seems to do better with detail. I also mused that issues are caused by ChatGPT being short of context. A little search on the dark arts suggested that one way to help with context was role-play (wizards? role-play? Is this AI or AD&D?) and another was to use multiple iterations to refine the prompt. So I tried all of that and I’ll show you what worked.

Normally at this stage of a project I like to get a test definition document together to share with the client. So I thought I’d get some help with that, using my new-found magical powers.

Naturally, a performance tester needs a good baseline, so I asked ChatGPT a brief and (mostly) context-free question: “Please prepare a performance test definition document for a flight booking website”. In return it gave me a largely generic, but not terrible, structure. You can see it here.

I then tried again with more detail, context and role.

As a performance tester, I need to create a test definition document for my client. I am testing a flight booking website, using 2 processes “search for flight” and “book a flight”. I will be using 31 virtual users, 28 for the “search” and 3 for the “book”. The target response time is 500ms for each transaction and memory and cpu should both be less than 75%. The project has a 6 week window from start to final report. I will run a peak load and an endurance load for 24 hours. Please create my document, including a 6 week plan, the load profile, the success criteria, monitoring, a RACI and any risks.

OK, I kinda spoon-fed it, but the results were 100 times better. Full chat is here, but as an example it broke my work up for me nicely:

Test Plan (6-Week Timeline)

  • Week 1: Requirement Analysis & Test Environment Setup
    • Understand client requirements.
    • Set up the test environment.
  • Week 2-3: Script Development & Test Design
    • Develop test scripts for ‘Search for Flight’ and ‘Book a Flight’.
    • Design load test scenarios.
  • Week 4: Test Execution – Dry Run
    • Execute dry runs to validate test scripts and environment.
  • Week 5: Full Load & Endurance Test Execution
    • Conduct peak load test.
    • Execute 24-hour endurance test.
  • Week 6: Result Analysis & Report Preparation
    • Analyze test results.
    • Prepare and submit the final report.
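Incidentally, those success criteria from my prompt – 500ms per transaction, CPU and memory both under 75% – boil down to a simple pass/fail check at the end of a run. Here’s a minimal Python sketch of that check (the helper function and the sample result numbers are my own invention, not anything ChatGPT produced):

```python
# Success criteria from the test definition: each transaction's
# response time must be at or under 500 ms, and peak CPU and memory
# utilisation must both stay below 75%.
TARGET_RESPONSE_MS = 500
MAX_UTILISATION_PCT = 75

def meets_criteria(results: dict) -> bool:
    """results maps transaction names to measured response times (ms),
    plus peak cpu_pct and mem_pct utilisation figures."""
    timings_ok = all(ms <= TARGET_RESPONSE_MS
                     for ms in results["response_ms"].values())
    resources_ok = (results["cpu_pct"] < MAX_UTILISATION_PCT
                    and results["mem_pct"] < MAX_UTILISATION_PCT)
    return timings_ok and resources_ok

# Hypothetical numbers from a peak-load run:
sample = {
    "response_ms": {"search for flight": 420, "book a flight": 510},
    "cpu_pct": 68,
    "mem_pct": 71,
}
print(meets_criteria(sample))  # fails: "book a flight" breaches 500 ms
```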

I then asked it to tell me what tools I might need, and again it did rather well. I could feel my natural cynicism melting. It liked JIRA for project management, LoadRunner or JMeter for the test, made lots of use of Excel and suggested all manner of tools for monitoring (Dynatrace, Grafana, Prometheus). Clearly it’s happy to give my clients a wide range of budget choices.

Then, feeling like I wanted some extra help, I asked it to expand on all the areas. And hey presto, it abandoned some of the more expensive tools (weirdly), gave me a bunch of risks (usefully) and generally produced a nice document template I could work with.

So what magic powers did I learn? These four “spells” seemed to be helpful, as mentioned above.

Detail – the more you give, the better the answer. It seems to extrapolate from whatever you supply, so the less you leave it to guess, the better it does

Context – probably a corollary of adding detail is adding context. “Give me a plan” does less well than “give me a 6 week plan”, for example

Role – as mentioned by everyone on the planet, saying who is “creating” the document helps

Iterations – it’s also good to keep pushing, asking for expansion or clarification on areas where you want more
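Putting the first three spells together, here’s a toy Python sketch of how I now assemble a prompt – role first, then context, with the detail spelled out rather than left to extrapolation. (The `build_prompt` helper is purely illustrative, my own invention rather than any official API):

```python
def build_prompt(role: str, task: str, details: list[str],
                 constraints: list[str]) -> str:
    """Assemble a prompt that leads with a role and the task context,
    then spells out the detail rather than leaving ChatGPT to guess."""
    lines = [f"As a {role}, I need to {task}."]
    lines += [f"- {d}" for d in details]          # the "detail" spell
    if constraints:                               # the "context" spell
        lines.append("Please include: " + ", ".join(constraints) + ".")
    return "\n".join(lines)

prompt = build_prompt(
    role="performance tester",                    # the "role" spell
    task="create a test definition document for a flight booking website",
    details=[
        "Processes: 'search for flight' and 'book a flight'",
        "31 virtual users: 28 searching, 3 booking",
        "Targets: 500ms per transaction, CPU and memory under 75%",
        "6 week window; peak load plus a 24-hour endurance run",
    ],
    constraints=["a 6 week plan", "the load profile", "success criteria",
                 "monitoring", "a RACI", "any risks"],
)
print(prompt)
```

The result reads very much like my flight-booking prompt above – and the fourth spell, iteration, is just calling it again with more detail added.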

All of this has certainly helped me get beyond the stage of receiving endless lists of things. There is one more thing that can help though, and that’s…

Creativity – be prepared to use the tool in an interesting way, throwing it a few off-beat instructions. That seems to be where it really shines

As an entirely useless example of this – I asked it to translate the whole plan into Shakespearean language. It went full tilt into that and you can see the results in the chat linked above. But as an example, here’s my weekly plan, Elizabethan style:

The Chronicle of the Six-Week Quest

  • Fortnight the First: Gathering of Requirements & Conjuring of the Testing Sphere
    • Tools: JIRA, a scroll of tasks and timelines; Docker, a mystical cauldron for the brew of test environments.
  • Fortnights Second and Third: Scripting and Designing of the Test
    • Tools: Apache JMeter, a wizard’s staff for scripting; a tool of versatility for diverse scenarios.
  • The Fourth Fortnight: Preliminary Joust of Testing
    • Tools: JMeter for the trial; Jenkins, a squire to integrate the tests into the flow of CI/CD.
  • The Fifth Fortnight: Testing Under the Sun and Moon
    • Tools: JMeter, a steadfast steed for the tournament of load.
  • The Sixth and Final Fortnight: Analysis of the Oracle and Scribing of the Report
    • Tools: The Excel of Microsoft, a sage’s crystal for analysis; Power BI, a scryer’s tool for revelations in report.

I always knew there was something weird about Docker…

Happy System Smashing!