LLM models, like OpenAI's ChatGPT, rely heavily on well-written prompts to generate accurate, useful, and contextually appropriate responses. Crafting these prompts is a skill known as "prompt engineering," which involves structuring queries to align the model's output with user expectations. For beginners, with a step-by-step approach, you can create effective prompts.
Why Writing Better Prompts Matters
When working with AI models, the quality of the input determines the quality of the output. A well-structured prompt can save time, reduce confusion, and yield better results. In this guide, we'll explore how to write better prompts using three core steps:
1. Iterate,
2. Evaluate, and
3. Templatize.
Step 1: Iterate
Iteration means improving your prompt through trial and error.
1. Start Simple: Write a basic prompt addressing your
goal. For example:
"Generate SQL Query from the below metadata."
While straightforward, this prompt might produce results that are generic or incomplete because it lacks context and specificity.
2. Test the Output: Evaluate the model's response to identify gaps. For example, the response might:
o Miss handling constraints or primary keys.
o Generate a query that isn’t aligned with your desired database structure or use case.
3. Refine: Add details to make your intent
clear. For example:
"Generate an SQL SELECT query using the following metadata.
o Table - Employees, Fields - Name, Age, Department
o Constraint - Age > 30, Sort by - Age DESC."
This refined prompt provides the model with explicit instructions about the table, fields, constraints, and sorting order. It reduces ambiguity, ensuring a more relevant and complete response.
4. Iterate Further: If the output still doesn’t meet your
expectations (e.g., missing formatting or a JOIN condition), continue refining.
For instance:
"Generate a properly formatted SQL query that joins the Employees table
with the Departments table on DepartmentID, selects the fields Name, Age, and
DepartmentName, filters by Age > 30, and sorts the results by Age
DESC."
Through iteration, you can transform a generic prompt into a highly specific one that generates accurate and relevant responses.
Why Iteration Matters?
This process ensures the prompt evolves with your requirements. By starting simple and refining based on the output, you make the model work more effectively for your needs.
Example Prompt Template to get SQL query (from BigQuery)
You are an advanced SQL generator that understands BigQuery syntax. Based on the given metadata and user requirements, generate a valid and executable SQL query for BigQuery. ### Metadata: %1$s ### Dataset Information: Dataset Name: %2$s ### Table Details: %3$s ### User Instructions: %4$s ### Constraints: 1. Use only the provided dataset and table(s). 2. Ensure column types are respected (e.g., do not compare integers with strings without casting). 3. Respect primary key constraints and unique identifiers in joins or filters. 4. Incorporate NULL handling if the user instruction implies potential NULL values. ### Output: Strictly analyze the provided user instructions against the table details. If the user instructions do not match or are unrelated to the table metadata, return: {"query": "", "title": "PROMPT DO NOT MATCH TO THE TABLE SELECTION"} If the instructions match, generate a valid BigQuery SQL query and return in the following JSON format: {"query": "constructed query, do not include any explanation", "title": "Title For the Insight, do not include any explanation"} Make sure the query is optimized for BigQuery.
When I feed the metadata, dataset, table details, actual prompt looks like below.
You are an advanced SQL generator that understands BigQuery syntax. Based on the given metadata and user requirements, generate a valid and executable SQL query for BigQuery. ### Metadata: None ### Dataset Information: Dataset Name: abcCorp ### Table Details: - Table Name: Bank_Transaction - Columns: - Name: Date, Type: DATE, Size: N/A, Nullable: FALSE, Default: N/A, Comment: N/A, Precision: N/A, Primary Key: FALSE - Name: Description, Type: STRING, Size: N/A, Nullable: FALSE, Default: N/A, Comment: N/A, Precision: N/A, Primary Key: FALSE - Name: Deposits, Type: FLOAT, Size: N/A, Nullable: FALSE, Default: N/A, Comment: N/A, Precision: N/A, Primary Key: FALSE - Name: Withdrawls, Type: FLOAT, Size: N/A, Nullable: FALSE, Default: N/A, Comment: N/A, Precision: N/A, Primary Key: FALSE - Name: Balance, Type: FLOAT, Size: N/A, Nullable: FALSE, Default: N/A, Comment: N/A, Precision: N/A, Primary Key: FALSE ### User Instructions: Get me total deposits of yesterday ### Constraints: 1. Use only the provided dataset and table(s). 2. Ensure column types are respected (e.g., do not compare integers with strings without casting). 3. Respect primary key constraints and unique identifiers in joins or filters. 4. Incorporate NULL handling if the user instruction implies potential NULL values. ### Output: Strictly analyze the provided user instructions against the table details. If the user instructions do not match or are unrelated to the table metadata, return: {"query": "", "title": "PROMPT DO NOT MATCH TO THE TABLE SELECTION"} If the instructions match, generate a valid BigQuery SQL query and return in the following JSON format: {"query": "constructed query, do not include any explanation", "title": "Title For the Insight, do not include any explanation"} Make sure the query is optimized for BigQuery.
When I submitted above prompt to LLM, I got following response.
{ "query": "SELECT SUM(Deposits) AS total_deposits_yesterday FROM abcCorp.Bank_Transaction WHERE Date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);", "title": "Total Deposits of Yesterday" }
Previous Next Home
No comments:
Post a Comment