35 Lines of Prompts Let Codex Auto-Optimize Your Workflow — Reposted by OpenAI's President

An OpenAI employee recently shared a technique: with a prompt of just 35 lines, you can have Codex analyze your past 30 days of work history, identify repetitive tasks, and package them into reusable automated Skills. OpenAI's president reposted and liked the tweet.

Core Principle: Let AI Audit Your Work Habits

The core idea is simple: feed a carefully crafted prompt to Codex, have it review your past 30 days of conversation history and task data, and surface the things you do over and over.

Repeatedly loading documents, fixing bugs, organizing materials, writing weekly reports: Codex flags each of these small but time-consuming daily chores.
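As a rough illustration of this mining step, the sketch below counts how often a normalized task label appears in a history log and keeps the labels seen at least twice in the last 30 days. The record format, the `HistoryEntry` type, and the `find_repeated_tasks` helper are all hypothetical; the source does not describe how Codex actually stores or scans history.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class HistoryEntry:
    """One past task (hypothetical record format, for illustration only)."""
    day: date
    task: str


def find_repeated_tasks(history, today, window_days=30, min_count=2):
    """Return {task: count} for tasks seen at least `min_count` times in the window."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        entry.task.strip().lower()          # crude normalization of the label
        for entry in history
        if cutoff <= entry.day <= today     # keep only entries inside the window
    )
    return {task: n for task, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    today = date(2025, 6, 30)
    history = [
        HistoryEntry(date(2025, 6, 2), "Write weekly report"),
        HistoryEntry(date(2025, 6, 9), "write weekly report"),
        HistoryEntry(date(2025, 6, 16), "Write weekly report "),
        HistoryEntry(date(2025, 6, 10), "Fix login bug"),
        HistoryEntry(date(2025, 4, 1), "Fix login bug"),  # outside the 30-day window
    ]
    print(find_repeated_tasks(history, today))  # {'write weekly report': 3}
```

A real agent would group tasks by meaning rather than by exact label, but the counting logic stays the same.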

Codex Technical Background: OpenAI Codex began as a large language model based on the GPT architecture and fine-tuned for code understanding and generation. It was released in 2021 as the model behind GitHub Copilot. In 2025, OpenAI relaunched the Codex name for a cloud-based AI coding agent, shifting its positioning from a code completion tool to an agent system capable of autonomous planning and multi-step task execution. The new Codex runs in a sandboxed environment, can read and write files, execute terminal commands, and call external APIs, and accumulates contextual memory through continued interaction with users. In effect, it combines an LLM's language understanding with OS-level execution abilities.

After identification, Codex categorizes and handles tasks by their nature:

Reusable tasks: Packaged directly into Skills (skill templates) for one-click invocation next time
Tasks requiring specialized roles: Dispatched to Sub-Agents for execution
Scheduled check tasks: Set up as automated workflows that run without human monitoring

Skill and Agent Architecture Explained: In AI agent design, a Skill encapsulates reusable task logic in a standardized module, much like a function or a microservice in software engineering. When an agent identifies a highly repetitive task, it abstracts the execution steps, required parameters, and expected outputs into a callable Skill that can be invoked directly for similar future tasks without reasoning through them again. Sub-Agents are a core concept in multi-agent architectures: the main agent (the Orchestrator) handles task decomposition and scheduling, and delegates specific subtasks to specialized sub-agents that run in parallel or in sequence. This layered architecture borrows from the microservices philosophy and can improve efficiency and maintainability for complex tasks. It is a common design pattern in current AI agent frameworks such as AutoGPT, LangGraph, and OpenAI Swarm.
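The Skill and Orchestrator idea can be sketched in a few lines. This is a minimal illustration under assumptions, not how Codex or any named framework implements it: the `Skill` and `Orchestrator` classes and their method names are hypothetical, and each Skill or sub-agent is simply a function from a text payload to a text result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Skill:
    """A reusable task (hypothetical type): a name, a description, and a callable."""
    name: str
    description: str
    run: Callable[[str], str]


@dataclass
class Orchestrator:
    """Main agent (hypothetical type): routes work to a Skill or a sub-agent."""
    skills: Dict[str, Skill] = field(default_factory=dict)
    sub_agents: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register_skill(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def register_sub_agent(self, role: str, handler: Callable[[str], str]) -> None:
        self.sub_agents[role] = handler

    def dispatch(self, target: str, payload: str) -> str:
        # Prefer a packaged Skill; fall back to a specialized sub-agent.
        if target in self.skills:
            return self.skills[target].run(payload)
        if target in self.sub_agents:
            return self.sub_agents[target](payload)
        raise KeyError(f"no skill or sub-agent named {target!r}")


if __name__ == "__main__":
    orch = Orchestrator()
    orch.register_skill(
        Skill("weekly_report", "Format notes as a report", lambda notes: "Report: " + notes)
    )
    orch.register_sub_agent("reviewer", lambda text: "Reviewed: " + text)
    print(orch.dispatch("weekly_report", "shipped v2"))  # Report: shipped v2
    print(orch.dispatch("reviewer", "diff for login fix"))  # Reviewed: diff for login fix
```

Keeping Skills and sub-agents behind one `dispatch` call means the caller does not need to know which kind of handler does the work.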

Crucially, Codex doesn't blindly automate everything. It first makes a judgment: Has this task appeared at least twice? Will it continue to occur in the future? Is the process stable enough? Is it worth the automation investment? Only when all these conditions are met does it start taking action.
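The four questions above amount to a gate that every candidate task must pass. A minimal sketch follows, assuming a hypothetical `TaskCandidate` record; in particular, the payoff rule (time saved over expected future runs must exceed the setup time) is an assumption made here to give "worth the investment" a concrete form, not something the source specifies.

```python
from dataclasses import dataclass


@dataclass
class TaskCandidate:
    """Hypothetical description of a task being considered for automation."""
    name: str
    occurrences: int             # times seen in the review window
    likely_to_recur: bool        # expected to come up again
    process_is_stable: bool      # the steps are the same each time
    minutes_per_run: float       # manual time per occurrence
    expected_future_runs: int    # rough forecast of future occurrences
    minutes_to_automate: float   # estimated one-off setup cost


def worth_automating(task: TaskCandidate) -> bool:
    """Return True only when all four checks pass."""
    repeated = task.occurrences >= 2
    # Assumed payoff rule: total time saved must exceed the setup cost.
    pays_off = task.minutes_per_run * task.expected_future_runs > task.minutes_to_automate
    return repeated and task.likely_to_recur and task.process_is_stable and pays_off


if __name__ == "__main__":
    report = TaskCandidate("weekly report", 4, True, True, 20, 10, 60)
    one_off = TaskCandidate("migrate old wiki", 1, False, True, 120, 0, 90)
    print(worth_automating(report))   # True
    print(worth_automating(one_off))  # False
```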

Screen Reading + Long-Term Memory: Breaking Beyond the Chat Box

Codex recently launched a screen reading feature, which extends this technique further.

With this capability enabled, Codex can not only analyze your operations within the chat interface but also "see" what you're doing in browsers, office software, email clients, and other applications. In other words, it can capture repetitive behavior patterns outside of Codex itself.

Screen Reading Technical Principles: Screen reading is an extension of multimodal perception. Features of this kind typically rely on vision-capable models (such as GPT-4V or GPT-4o) to semantically parse screenshots or live screen content. In a typical design, the system periodically captures the screen, extracts text through OCR (Optical Character Recognition), uses a vision model to interpret the layout and state of UI elements, and then converts the result into structured behavior logs for the language model to analyze. The approach resembles Anthropic's Claude Computer Use and Google's Project Mariner, and reflects a broader trend of AI moving from "language space" into "operational space." Its core value is that the AI is no longer limited to content users actively type in; it can passively observe their real workflows.

Combined with Codex's memory feature, which retains your personal preferences, project context, and past corrections over the long term, Codex increasingly resembles an AI colleague that observes your work habits and proactively helps streamline your workflow.

Technical Implementation of Long-Term Memory: Large language models are inherently stateless: once a conversation ends, the model retains nothing on its own. AI systems typically implement "long-term memory" through external storage. Important information (user preferences, project context, historical decisions) is serialized into a vector database (such as Pinecone or Weaviate) or a structured database, and relevant memories are injected into the context window via RAG (Retrieval-Augmented Generation) when a new conversation begins. The memory features in ChatGPT and Codex can be understood as productized versions of this general architecture. The quality of the memory system largely determines how "personalized" an AI agent feels: the more precise the memory, the better the agent understands a user's work habits and implicit preferences, and the less the user has to repeat. This is also the technical prerequisite that lets Codex "proactively audit work habits."
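The store-then-retrieve pattern described here can be shown with a toy example. The sketch below is an assumption-laden stand-in: it uses a bag-of-words vector and cosine similarity in place of a real embedding model and vector database, and the `MemoryStore` class and `build_prompt` helper are hypothetical names, not part of any OpenAI product.

```python
import math
import re
from collections import Counter


def _vectorize(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Hypothetical in-memory replacement for a vector database."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def remember(self, text):
        self._items.append((text, _vectorize(text)))

    def recall(self, query, k=2):
        """Return up to k stored memories that share vocabulary with the query."""
        q = _vectorize(query)
        scored = [(_cosine(q, vec), text) for text, vec in self._items]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


def build_prompt(store, user_message, k=2):
    """Inject the retrieved memories ahead of the new user message."""
    memories = store.recall(user_message, k)
    memory_block = "\n".join(f"- {m}" for m in memories) or "- (none)"
    return f"Relevant memories:\n{memory_block}\n\nUser: {user_message}"


if __name__ == "__main__":
    store = MemoryStore()
    store.remember("User prefers weekly reports in bullet-point format")
    store.remember("Project Alpha uses Python 3.12 and pytest")
    print(build_prompt(store, "Draft the weekly report for this week"))
```

Only the first memory shares a word with the query, so only it is injected; a production system would swap `_vectorize` for learned embeddings so that related wording also matches.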

Moreover, this approach is far from limited to programmers. Writers, operations managers, planners — anyone whose work involves repetitive labor can benefit from it.

Community Response and Token Cost Considerations

The community responded enthusiastically once the technique was shared. Many users reported "this is insane" after trying it and called for it to be made into an official plugin. OpenAI's president also reposted and liked the tweet, a sign of how well it was received.

Of course, some raised practical concerns: reviewing 30 days of history — how many tokens and credits does that consume? For regular users, is this expense worthwhile?

Quantifying Token Costs: Tokens are the basic unit of text processing in large language models. As a rough guide, each English word corresponds to about 1-1.5 tokens and each Chinese character to about 1-2 tokens. If each day averages 500-1,000 tokens of interaction, 30 days of conversation history comes to approximately 15,000-30,000 input tokens. Adding the output tokens for Codex's analysis and Skill generation, a single audit could consume 50,000-100,000 tokens in total. Using GPT-4o's pricing as a reference (approximately $2.50 per million input tokens and $10 per million output tokens), that works out to roughly $0.40-$0.80 per audit, and no more than $1 even if all 100,000 tokens were billed at the output rate, which is acceptable for heavy users. However, costs scale roughly linearly with how often the audit runs and how much history it covers, which is why community users asked whether the token expense is worthwhile.
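The arithmetic can be checked with a small calculator. This sketch uses the per-million-token prices quoted above and assumes that every token beyond the 30-day history is an output token; the `audit_cost_usd` function name and that input/output split are assumptions for illustration. Under those assumptions both ends of the range come out below one dollar.

```python
def audit_cost_usd(input_tokens, output_tokens,
                   input_price_per_million=2.50, output_price_per_million=10.00):
    """Cost in USD for one audit, given token counts and per-million-token prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    days = 30
    # Low case: 500 tokens/day of history, 50,000 tokens in total.
    low_input = days * 500
    low_cost = audit_cost_usd(low_input, 50_000 - low_input)
    # High case: 1,000 tokens/day of history, 100,000 tokens in total.
    high_input = days * 1_000
    high_cost = audit_cost_usd(high_input, 100_000 - high_input)
    print(f"low estimate:  ${low_cost:.4f}")
    print(f"high estimate: ${high_cost:.4f}")
    # Upper bound: all 100,000 tokens billed at the output rate.
    print(f"upper bound:   ${audit_cost_usd(0, 100_000):.4f}")
```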

The person who shared the technique did not address the question directly; as an OpenAI employee, they probably don't need to worry about token consumption.

Other Advanced Codex Techniques from This Employee

This OpenAI employee regularly experiments and frequently shares advanced Codex usage tips on social media. For example:

Using Codex to configure a Raspberry Pi: Ensuring the device can be remotely accessed after connecting to home WiFi
Loop command mode: Defining a completion state for Codex, telling it "what success looks like," then having it loop execution until the goal is achieved

Loop Command Mode and Goal-Oriented Execution: Loop command mode (also called "goal-oriented loop execution") is an important task execution paradigm in AI agent systems, and its core idea comes from feedback loops in cybernetics. Traditional AI usage follows a single request-response pattern, whereas in Loop mode the agent repeats a "perceive → plan → act → evaluate" cycle until a preset termination condition (i.e., "what success looks like") is met. This loosely resembles the evaluate-and-adjust cycle in reinforcement learning, although no model weights are updated: after each iteration the agent evaluates the gap between the current state and the goal state and adjusts its next action accordingly. In engineering practice, similar patterns appear in automated testing, CI/CD (Continuous Integration/Continuous Deployment) pipelines, and complex multi-step data processing tasks. Codex's Loop mode brings this engineering concept to the natural language interaction layer, so that non-technical users can define and drive complex automated workflows.
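The loop itself is a small control structure. The sketch below assumes a hypothetical `run_until_done` helper in which the caller supplies an `act` step and an `is_done` success check. An iteration budget is added here as a safety assumption so the loop cannot run forever; the source does not say whether Codex enforces one.

```python
def run_until_done(state, act, is_done, max_iterations=10):
    """Repeat act(state) until is_done(state) is true or the budget runs out.

    Returns (final_state, iterations_used, succeeded).
    """
    for i in range(max_iterations):
        if is_done(state):          # evaluate: has the goal been reached?
            return state, i, True
        state = act(state)          # act: take one step toward the goal
    return state, max_iterations, is_done(state)


if __name__ == "__main__":
    # Toy goal: "success looks like zero failing tests".
    # Each step is assumed to fix exactly one failing test.
    failing_tests = 3
    result = run_until_done(
        failing_tests,
        act=lambda n: n - 1,
        is_done=lambda n: n == 0,
    )
    print(result)  # (0, 3, True)
```

In a real agent, `act` would call the model and run tools, and `is_done` would run the test suite or another objective check.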

The common thread across these techniques: they don't treat Codex as a simple Q&A tool, but as an intelligent agent with autonomous judgment and continuous execution capabilities.

Implications for Regular Users: Redefining AI's Role

The biggest takeaway from this case is: The upper limit of an AI tool's value depends on how you define its role.

Most people still use AI in a "question and answer" fashion, but this employee's approach lets AI proactively audit your behavior patterns and tell you which work can be optimized. This "meta-level" usage mindset is the real key to unlocking AI productivity.

If you're also using Codex, try this approach: instead of manually submitting requests each time, first let AI understand the full picture of your work, then have it propose optimization suggestions. Behind those 35 lines of prompts lies an entirely new paradigm of human-AI collaboration.

Key Takeaways

An OpenAI employee used 35 lines of prompts to have Codex automatically analyze 30 days of historical data, identify repetitive work, and package it into reusable automated Skills
Codex intelligently judges whether tasks are worth automating, requiring conditions like repeated occurrence and process stability before execution
Combined with screen reading and long-term memory, Codex can capture users' repetitive behavior patterns beyond the chat interface
The approach isn't limited to programmers — operations, planning, writing, and other roles can all benefit
The core insight is transforming AI from a passive Q&A tool into an intelligent agent that proactively audits work habits
