Key Takeaways from Andrew Ng × OpenAI's Prompt Engineering Course: Two Core Principles Explained

Andrew Ng's prompt engineering course distilled: write clear instructions and give the model time to think.
This article breaks down the key insights from Andrew Ng and OpenAI's ChatGPT Prompt Engineering for Developers course. It covers the critical distinction between Base LLMs and instruction-tuned models, explains the two core principles of effective prompting — writing clear, specific instructions and giving the model time to think — and highlights the shift from memorizing prompt templates to understanding LLM fundamentals for API-driven development.
Course Background and Value
The course ChatGPT Prompt Engineering for Developers, co-created by Andrew Ng and OpenAI's Iza Fulford, is widely recognized as one of the most systematic introductory tutorials on prompt engineering. Iza Fulford built the popular ChatGPT Retrieval Plugin and has long been dedicated to teaching developers how to leverage Large Language Model (LLM) technology in their products.
It's worth noting that the ChatGPT Retrieval Plugin built by Iza was a landmark project in OpenAI's early plugin ecosystem. The plugin allowed ChatGPT to retrieve information from external knowledge bases in real time, breaking through the limitation of LLM training data cutoff dates. Its technical core involves storing external documents as embedding vectors in a vector database. When a user asks a question, the system first performs a semantic search to find the most relevant document fragments, then injects them as context into the prompt for the model to reference. This architecture later evolved into the widely adopted RAG (Retrieval-Augmented Generation) paradigm, becoming one of the foundational infrastructures for enterprise-grade LLM applications.
The course's core positioning is crystal clear: It's not about memorizing "30 universal prompts." Instead, it's aimed at developers, teaching them how to rapidly build software applications by calling LLMs through APIs. Andrew Ng points out at the very beginning that a vast number of prompt tips articles on the internet focus on one-off tasks in the ChatGPT web interface, but the true power of LLMs as developer tools — rapidly building applications through API calls — remains severely underestimated.
It's important to understand the fundamental difference between API calls and web interface usage. Through API calls, developers can precisely control key parameters such as System Prompt, Temperature, and Max Tokens. They can also embed LLMs into automated workflows, enabling batch processing, chain calls, and integration with other systems. The web interface is essentially a single-turn/multi-turn conversation interface with preset parameters, while the API turns the LLM into a programmable reasoning engine around which developers can build arbitrarily complex application logic.

Andrew Ng's AI Fund team has collaborated with numerous startups to apply these technologies across various scenarios, witnessing firsthand the kinds of products developers can build at remarkable speed using LLM APIs. The course covers prompt best practices, common use cases (summarization, inference, transformation, expansion), and core modules like building chatbots with LLMs.
What's the Difference Between Base LLMs and Instruction-Tuned Models?
The first step in understanding prompt engineering is knowing what kind of model you're talking to. Andrew Ng categorizes LLMs into two major types:
Base LLM: A Foundation Model Based on Text Completion
The training objective of a Base LLM is to predict the next word. Trained on massive amounts of internet text, it learns "what word is most likely to follow the current text."
From a technical perspective, this training paradigm is called Autoregressive Language Modeling. During training, the model takes the first N words of a text passage as input and attempts to predict the probability distribution of the (N+1)th word, then continuously adjusts its parameters through backpropagation to improve prediction accuracy. This seemingly simple objective function, given a sufficiently large dataset and model scale, gives rise to astonishing language understanding and generation capabilities. In the name "GPT," "G" stands for Generative, "P" for Pre-trained, and "T" for Transformer — together describing the three core elements of this technical approach.
For example: if you input "Once upon a time, there was a unicorn," it might continue with "that lived in a magical forest with all its unicorn friends" — a reasonable text continuation. But if you ask "What is the capital of France?", a Base LLM likely won't answer "Paris" directly. Instead, it might continue with "What is the largest city in France? What is the population of France?" — because such Q&A lists genuinely exist in abundance on the internet.

Instruction-Tuned LLM: A Fine-Tuned Model That Follows Instructions
This is the mainstream direction in current LLM research and practice. Instruction-tuned models are specifically trained to follow instructions. Ask the same question — "What is the capital of France?" — and it will directly answer "The capital of France is Paris."
The training process typically involves three steps:
- Base pre-training: Train a Base LLM on massive text data
- Instruction fine-tuning: Further train the model using "instruction–high-quality response" paired data
- RLHF optimization: Use Reinforcement Learning from Human Feedback (RLHF) to make the model better at following instructions
Among these, RLHF is the key technical step that transforms a Base LLM into a practical assistant. The specific process works as follows: first, human annotators rank multiple model outputs, labeling which responses are better; then a Reward Model is trained on this ranking data to learn to predict human preferences; finally, reinforcement learning algorithms such as PPO (Proximal Policy Optimization) are used to further optimize the language model's output strategy, using the reward model's scores as the signal. OpenAI's InstructGPT paper was the first to systematically demonstrate the effectiveness of this process — a 1.3B parameter model trained with RLHF even outperformed a 175B parameter model without RLHF in human evaluations, proving the enormous value of alignment training.

Instruction-tuned models are trained to be helpful, honest, and harmless, which significantly reduces the probability of toxic outputs compared to Base LLMs. These three principles are commonly known as the "HHH" alignment criteria, first systematically proposed by Anthropic in their research papers. Notably, tension sometimes exists between these three dimensions: for instance, a "helpful" model might try its best to answer every question, but this could conflict with the "harmless" principle (e.g., answering questions about manufacturing dangerous items); an "honest" model might point out errors in a user's viewpoint, but this could reduce its perceived "helpfulness." How to strike a balance among these three dimensions is one of the core challenges in current AI Alignment research, and it directly impacts prompt engineering practice — developers need to understand where the model's safety boundaries lie.
Andrew Ng explicitly recommends: For the vast majority of practical applications, developers should use instruction-tuned models.
The Core Mental Model for Prompt Engineering
One of the most insightful analogies in the course is this: Think of using an instruction-tuned LLM as giving instructions to someone who is smart but unfamiliar with your specific task. Imagine the other person as a brilliant recent college graduate — intelligent and capable, but needing you to clearly explain the task.

Principle 1: Make Your Instructions Clear and Specific
When an LLM's output doesn't meet expectations, the problem often isn't the model — it's that the instructions aren't clear enough.
Take "Write something about Alan Turing" as an example. The problem with this instruction is excessive ambiguity:
- Unclear focus: Do you want his scientific achievements, personal life, or his role in history?
- Unclear tone and style: Should it read like a professional journalist's report or a casual note to a friend?
- Missing context: If you can specify what reference materials the model should consider, the results will be much better
Each additional layer of information helps the model narrow down the "space of possible answers," thereby more precisely hitting your actual needs.
Principle 2: Give the Model Enough Time to Think
The second core principle Andrew Ng emphasizes in the course is — Give the LLM time to think. This principle is elaborated in detail by Iza in subsequent lessons. The core idea is: for complex tasks, don't expect the model to produce the answer in one shot; instead, guide it to reason step by step.
This is directly connected to the now widely known "Chain of Thought" (CoT) technique. The Chain of Thought technique was formally introduced by Jason Wei and colleagues from the Google Brain team in a 2022 paper. Research found that when prompts include guiding phrases like "Let's think step by step," or when examples with intermediate reasoning steps are provided, LLM performance on mathematical reasoning, logical reasoning, and commonsense reasoning tasks improves significantly. The intuition behind this is that complex problems require multi-step reasoning, and directly asking the model to output the final answer is equivalent to requiring it to complete all computation in a single forward pass; guiding it to output intermediate steps greatly reduces the reasoning burden at each step. Subsequent research has further developed more advanced reasoning strategies such as Tree of Thought and Self-Consistency, forming an increasingly rich ecosystem of LLM reasoning enhancement techniques.
Practical Insights from the Course
The value of this course lies not only in the technical knowledge itself but also in establishing a systematic prompt engineering mental framework:
- From "memorizing templates" to "understanding principles": Stop relying on fixed prompt templates circulating online; instead, understand how LLMs work and optimize prompts at the principle level
- From "web chat" to "API development": Upgrade LLMs from chat tools to core productivity tools for developers
- From "one-off tasks" to "systematic applications": Build reusable LLM application patterns through standardized use cases like summarization, inference, transformation, and expansion
Interestingly, Andrew Ng specifically mentions that some popular prompt tips on the internet may be more suited to Base LLMs rather than today's mainstream instruction-tuned models. This means that if you're still using some "old-school" prompt techniques (such as excessive role-playing prefixes), you may need to reassess their actual effectiveness on modern models.
Summary
This prompt engineering course co-created by Andrew Ng and OpenAI provides developers with a complete learning path from beginner to advanced. Its core philosophy can be distilled into two sentences: Write your instructions clearly, and give the model time to think. It sounds simple, but truly implementing these two principles in development practice requires a deep understanding of how LLMs work. For engineers looking to integrate AI capabilities into product development, this course remains one of the most worthwhile foundational courses to invest your time in.
Related articles

Claude Code Installation Guide & The Five Stages of AI Programming Tools Explained
Complete Claude Code installation guide with the five stages of AI programming tools, from manual coding to agents. Learn 0-to-1 project building and 1-to-100 iteration challenges.

Enterprise-Level AI Project Rules Files: 5 Hard Rules + 6 Writing Techniques
AI keeps messing up your code? Learn 5 hard rules and 6 writing techniques for enterprise-level Rules files in Claude Code, Cursor & more, with templates.

Building Cloud Computing Clusters from Old Phones: Google and UCSD Explore a New Path to Sustainable Computing
Google and UCSD explore building cloud clusters from old phones, leveraging ARM chip efficiency to cut e-waste and data center carbon footprints.