Semi-AI Mode: A Pragmatic Approach to API Automation Testing Frameworks

Introduction: The Current State of AI in Testing

After more than a year of development, AI applications in software testing have gradually matured. In functional testing, AI can help us analyze requirements, generate test cases, and submit bugs—these relatively simple scenarios have been proven viable. However, when we enter the realm of API automation testing, things become much more complex.

This article explores a "semi-AI" approach to API automation testing, which is considered a relatively mature and reliable practice path at the current stage.

AI Applications in Functional Testing Have Matured

Two Mainstream Approaches to Test Case Generation

In functional testing, AI applications mainly focus on two directions:

Approach 1: Generating test points from prototypes. By providing AI with product prototypes (such as login pages, repayment interfaces, loan processes, etc.), it can automatically analyze page layouts and functional modules to generate corresponding test points. The prototypes mentioned here are typically interactive design drafts created with tools like Axure or Figma, containing page element layouts, interaction logic, and business processes. Multimodal large language models (such as GPT-4V, Claude, etc.) can directly "understand" these images, identify input fields, buttons, dropdown menus, and other UI elements, and automatically derive test points based on common test design methods (equivalence partitioning, boundary value analysis, etc.). For example, given a prototype of an installment repayment page, AI can generate comprehensive test points including page layout verification, loan amount validation, loan term selection, repayment method switching, repayment schedule display, bottom button functionality, and overall process consistency testing.

Approach 2: Generating test cases from requirement documents. By feeding requirement documents to AI, it can generate standardized test cases including case numbers, module names, case titles, priority levels, preconditions, operation steps, and expected results. In practice, AI-generated cases are often more consistently formatted than many cases written by hand.

Limitations of AI in Functional Testing

You may not have noticed that AI output length is limited (a single response is typically capped at a few thousand tokens), so the full set of test cases for an entire requirement document cannot be generated in one pass. This is a fundamental constraint of current AI-assisted testing.

It's important to understand that tokens are the basic units through which large language models process text; a single Chinese character typically corresponds to 1-2 tokens. Two limits matter here. The first is output: current mainstream models like GPT-4 cap a single response at approximately 4096-8192 tokens (about 2000-4000 Chinese characters). The second is input: context windows have expanded to 128K tokens or even longer, but models exhibit "attention decay" on extremely long texts, meaning their ability to use information from the middle portions drops noticeably. So even though it is technically possible to input large amounts of text, AI's effective processing capacity remains limited, and the work still requires text segmentation and task decomposition.
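The segmentation step itself is mechanical and can be scripted. The following is a minimal sketch, not a definitive implementation: it assumes a plain character budget is an acceptable stand-in for a token budget, and that paragraphs are separated by blank lines. The function name `split_document` and the default of 2000 characters are illustrative choices.

```python
from typing import List


def split_document(text: str, max_chars: int = 2000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring paragraph boundaries.

    Assumption: character count approximates the token budget closely
    enough for planning purposes; a real tokenizer would be more precise.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A single paragraph longer than the budget is cut into hard slices.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        # Otherwise, pack paragraphs together until the budget is reached.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to the model as a separate request, with the module name or a short summary repeated at the top so the model keeps the surrounding context.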

The Core Challenge of API Testing: Why Pure AI Approaches Don't Work

Multiple Challenges in API Testing

Compared to functional testing, API automation testing involves far more complex problems:

API documentation issues: In practice, API documentation may not exist, may be incomplete, or may have unclear descriptions. If humans can't understand it, AI certainly can't either. Ideally, development teams should use Swagger/OpenAPI specifications to maintain API documentation—this machine-readable format is much more AI-friendly. However, in reality, many projects still have API documentation as unstructured descriptions in Word or Confluence pages, or even existing only in verbal communication between developers.
Parameter type diversity: API parameters include four major categories: Params (query parameters appended to the URL), Data (form data submitted in application/x-www-form-urlencoded format), JSON (structured data submitted in application/json format), and File (file uploads using multipart/form-data encoding). Getting AI to accurately identify and distinguish these four parameter types is not easy, because different parameter types have fundamental differences in request construction, encoding methods, and Content-Type settings—confusing any one of them will cause the API call to fail.
API correlation and dependencies: Data transfer and dependency relationships exist between APIs. API correlation is one of the most critical technical challenges in API automation testing. Typical scenarios include: the token returned by the login API needs to be passed to all subsequent business APIs for authentication; the order ID returned by the create order API needs to be passed to the payment API. Common implementation approaches include: using a global variable pool to store intermediate data, extracting fields from responses via regular expressions or JSONPath, leveraging pytest's fixture mechanism for data sharing, or implementing dynamic replacement through placeholder syntax (such as ${token}) in YAML test cases. The longer the correlation chain, the higher the maintenance cost—this is one of the reasons pure AI approaches struggle.
Security mechanisms: Different security rules such as encryption, authentication, and signatures require targeted handling. Common security mechanisms include: OAuth 2.0 authorization flows, JWT (JSON Web Token) verification, API Key authentication, HMAC signature validation, RSA/AES data encryption, etc. Each mechanism has different implementation logic and often involves details like key management, timestamp synchronization, and random number generation—all of which require testers to deeply understand before correctly encapsulating them into the framework.
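As one illustration of the security mechanisms listed above, here is a hedged sketch of HMAC request signing using only the Python standard library. The field names `timestamp`, `nonce`, and `sign`, and the canonicalization rule (sort by key, join as `key=value` with `&`), are assumptions for the example; a real project must follow whatever rule its server actually verifies.

```python
import hashlib
import hmac
import time
import uuid


def _canonical_string(fields: dict) -> str:
    # Assumed rule: key=value pairs sorted by key and joined with "&".
    return "&".join(f"{key}={fields[key]}" for key in sorted(fields))


def sign_params(params: dict, secret: str, timestamp=None, nonce=None) -> dict:
    """Return a copy of params with timestamp, nonce and an HMAC-SHA256 sign added."""
    signed = dict(params)
    signed["timestamp"] = str(timestamp if timestamp is not None else int(time.time()))
    signed["nonce"] = nonce or uuid.uuid4().hex
    digest = hmac.new(
        secret.encode("utf-8"),
        _canonical_string(signed).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    signed["sign"] = digest
    return signed


def verify_params(signed: dict, secret: str) -> bool:
    """Recompute the signature over every field except sign and compare."""
    received = signed.get("sign", "")
    unsigned = {k: v for k, v in signed.items() if k != "sign"}
    expected = hmac.new(
        secret.encode("utf-8"),
        _canonical_string(unsigned).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(received, expected)
```

Keeping the timestamp and nonce injectable makes the helper deterministic in tests, which matters once it is encapsulated in the framework and reused by every case.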

Why Current AI Solutions on the Market Are Inadequate

Two common approaches currently have obvious problems:

Approach 1: AI generates automation scripts, then humans modify them. In practice, if too much modification is needed, the efficiency gain is negligible—it may even be less efficient than writing the code yourself. This is because AI-generated code often lacks understanding of project-specific context—it doesn't know your framework's encapsulation standards, doesn't understand the business logic relationships between APIs, and cannot handle project-specific encryption and signing algorithms. The resulting code has a significant "semantic gap" from actual requirements.

Approach 2: AI combined with platform development. Since the platform itself isn't stable enough, adding AI's uncertainty on top results in tools that are completely unusable in actual work.

The core issue is: these solutions cannot truly be implemented in production. If they can't be put into practice, no matter how impressive the technology looks, it has no practical value.

The Semi-AI Approach: A Complete Framework Design Philosophy

What Is the Semi-AI Approach?

The "semi-AI" approach means the framework is built and maintained by testers themselves, with AI providing assistance at specific stages. This approach acknowledges a reality: it's impossible to hand all problems over to AI—it simply doesn't have that capability yet.

This philosophy aligns with the "human-machine collaboration" concept in software engineering—assigning highly deterministic parts that require precise control to humans, while delegating highly repetitive, pattern-based parts to AI for acceleration. This ensures system reliability while maximizing efficiency gains.

Preparation Before Building the Framework

Before getting started, several key questions need to be clarified:

1. Protocol Type Confirmation

HTTP protocol
WebService protocol
Dubbo protocol

Different protocols require completely different framework encapsulation approaches. HTTP is currently the most mainstream Web API communication protocol: it follows a request-response model, usually carries JSON or XML payloads, and is lightweight and easy to debug. WebService is an earlier distributed service architecture based on the SOAP protocol and the WSDL description language. It exchanges data as XML and is commonly found in traditional enterprise systems such as banking and insurance; its message structure is complex but highly standardized. Dubbo is a high-performance RPC framework open-sourced by Alibaba, primarily used for internal calls between Java microservices. It transmits data with binary serialization, which generally makes it faster than JSON over HTTP, but it is also harder to debug. The three protocols differ significantly in request construction, parameter encapsulation, and response parsing, so framework design must include targeted low-level encapsulation for each.

2. API Scale and Business Analysis

How many APIs are there?
Is it single API testing or business flow testing?
What is the business complexity?

Single API testing focuses on the input/output correctness of individual APIs, typically using data-driven approaches to cover various parameter combinations. Business flow API testing focuses on end-to-end business scenarios after chaining multiple APIs together, such as a complete flow of "register → login → place order → pay → query order." The two differ fundamentally in case design, data management, and assertion strategies, and the framework needs to support both modes.
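Chaining APIs in a business flow means carrying values from one response into the next request. A minimal sketch of that idea follows, assuming a `${name}` placeholder syntax in case data and a simple dot-separated path in place of full JSONPath. The class name `VariablePool` and its methods are hypothetical, not part of any library.

```python
import re

# Matches placeholders such as ${token} or ${order_id}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


class VariablePool:
    """Stores values extracted from responses and injects them into later requests."""

    def __init__(self):
        self._values = {}

    def extract(self, response_body, name: str, path: str) -> None:
        """Save the value found at a dot-separated path, e.g. 'data.token'.

        This is a simplified stand-in for JSONPath: each segment is a dict
        key, or a list index when the current value is a list.
        """
        value = response_body
        for key in path.split("."):
            value = value[int(key)] if isinstance(value, list) else value[key]
        self._values[name] = value

    def render(self, template):
        """Recursively replace ${name} placeholders in strings, dicts and lists."""
        if isinstance(template, str):
            # An unknown placeholder raises KeyError, so a broken chain fails loudly.
            return PLACEHOLDER.sub(lambda m: str(self._values[m.group(1)]), template)
        if isinstance(template, dict):
            return {key: self.render(value) for key, value in template.items()}
        if isinstance(template, list):
            return [self.render(item) for item in template]
        return template


# Usage: the login response feeds the order request.
pool = VariablePool()
login_response = {"code": 0, "data": {"token": "abc123"}}
pool.extract(login_response, "token", "data.token")
order_case = {"headers": {"Authorization": "Bearer ${token}"}, "json": {"sku": "X1", "qty": 2}}
rendered_case = pool.render(order_case)
```

Failing on an unknown placeholder is a deliberate choice here: in a long flow, a silent empty substitution is much harder to diagnose than an immediate error at the step that broke the chain.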

3. The Four Elements of a Request

Request method (GET/POST/PUT/DELETE, etc.)
Request path
Request parameters (four types: Params/Data/JSON/File)
Request headers
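The four elements, and the four parameter types inside them, can be modeled explicitly so that a test case cannot blur them. The sketch below is one possible shape, under the assumption that requests will eventually be sent with the Requests library, whose `request()` function accepts `params`, `data`, `json`, and `files` keyword arguments. The `ApiRequest` class and `to_kwargs` method are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ApiRequest:
    method: str                                   # element 1: request method
    path: str                                     # element 2: request path
    headers: dict = field(default_factory=dict)   # element 4: request headers
    # Element 3: request parameters, one slot per parameter type.
    params: Optional[dict] = None   # query string appended to the URL
    data: Optional[dict] = None     # application/x-www-form-urlencoded body
    json: Optional[Any] = None      # application/json body
    files: Optional[dict] = None    # multipart/form-data upload

    def to_kwargs(self, base_url: str) -> dict:
        """Build keyword arguments suitable for requests.request(**kwargs)."""
        # A JSON body mixed with a form body or upload is ambiguous,
        # so treat it as a case-authoring mistake.
        if self.json is not None and (self.data is not None or self.files is not None):
            raise ValueError("json cannot be combined with data or files")
        kwargs = {
            "method": self.method.upper(),
            "url": base_url.rstrip("/") + "/" + self.path.lstrip("/"),
            "headers": dict(self.headers),
        }
        for name in ("params", "data", "json", "files"):
            value = getattr(self, name)
            if value is not None:
                kwargs[name] = value
        return kwargs
```

With this shape, the Content-Type decision is made by which slot the case author fills in, and the HTTP library derives the encoding from it rather than from a hand-written header.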

4. Technology Stack Discussion

Tool-based approach: Postman, Apifox, etc.
Code-based approach: Python framework, Java framework
Platform-based approach: Internal platform, paid platform, self-developed platform
Whether to incorporate AI

In code-based approaches, the Python ecosystem has become the mainstream choice for API automation testing due to its rich testing library support (pytest + requests + allure) and relatively low learning curve. Java solutions (such as REST Assured + TestNG) are more common in large enterprises, especially when the system under test is itself Java-based, which makes it more convenient to reuse utility classes and data models from the project.

Core Problems the Framework Needs to Solve

A complete API automation testing framework needs to address the following problems:

Problem Category	Specific Content
Case Writing	YAML/Excel/CSV format selection, writing methods for single API and flow cases
Case Reading	Individual or batch reading, execution strategy
Request Sending	Unified sending via Requests library, single or batch
Result Assertion	Verifying the correctness of returned results
API Correlation	Data transfer solutions between upstream and downstream APIs
Cross-Environment Execution	Switching between test/production/development environments
Security Handling	Encapsulation of encryption, signing, and other mechanisms
Logging System	Comprehensive log recording and tracing
Report Customization	Personalized Allure report configuration
CI/CD Integration	Jenkins continuous integration and unattended execution

Requests is the most popular HTTP client library in Python, known for its clean API design. It supports all HTTP methods including GET/POST/PUT/DELETE, with built-in Session management, Cookie handling, SSL verification, file upload, and more. In API automation testing, Requests is typically wrapped in a thin project-level layer that handles request header injection, response logging, exception retry, and other common logic in one place. Combined with the pytest testing framework, YAML data-driven cases, and Allure report generation, it forms a complete API automation testing solution.
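A wrapper of this kind might look like the following sketch. To keep it runnable without a network or third-party packages, the actual sender is injected as a callable; in a real project that callable would plausibly be `requests.Session().request`. The class name `ApiClient`, its parameters, and the retry policy are all assumptions made for illustration.

```python
import logging
import time
from typing import Callable, Optional, Tuple

logger = logging.getLogger("api_client")


class ApiClient:
    """Thin layer adding default headers, logging and retry around a sender."""

    def __init__(
        self,
        base_url: str,
        send: Callable,
        default_headers: Optional[dict] = None,
        max_retries: int = 2,
        backoff_seconds: float = 0.5,
        # Requests' own exceptions derive from IOError (OSError),
        # so this default would also cover them.
        retry_on: Tuple[type, ...] = (OSError,),
    ):
        self.base_url = base_url.rstrip("/")
        self._send = send
        self.default_headers = dict(default_headers or {})
        self.max_retries = max_retries
        self.backoff_seconds = backoff_seconds
        self.retry_on = retry_on

    def request(self, method: str, path: str, **kwargs):
        # Per-request headers override the injected defaults.
        headers = {**self.default_headers, **(kwargs.pop("headers", None) or {})}
        url = f"{self.base_url}/{path.lstrip('/')}"
        for attempt in range(1, self.max_retries + 2):
            try:
                response = self._send(method, url, headers=headers, **kwargs)
                logger.info("%s %s -> %s", method, url, getattr(response, "status_code", "?"))
                return response
            except self.retry_on as exc:
                logger.warning("%s %s failed on attempt %d: %s", method, url, attempt, exc)
                if attempt > self.max_retries:
                    raise  # retries exhausted, surface the last error
                time.sleep(self.backoff_seconds)
```

Injecting the sender also makes the wrapper easy to unit test with a fake, which is useful because this layer sits underneath every test case in the framework.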

Allure is a lightweight, multi-language test reporting framework, originally developed at Yandex and now maintained by Qameta Software, that integrates with mainstream testing frameworks like pytest, JUnit, and TestNG. The reports it generates offer rich visualization, including categorized test case display, execution timelines, failure screenshot attachments, test step details, and historical trend charts. In enterprise-level API automation projects, Allure reports are typically customized with environment information, custom severity level tags, and links to the defect management system, so that test results can be presented clearly to development teams and management.

Regarding CI/CD and Jenkins continuous integration, CI/CD (Continuous Integration/Continuous Delivery) is a core practice in modern software engineering. Its philosophy is to deliver code changes to production environments quickly and reliably through automated pipelines. Jenkins is the most widely used open-source CI/CD tool, supporting complete build, test, and deployment process definitions through Pipeline scripts. In API automation testing scenarios, Jenkins is typically configured as: code commit triggers automated test execution → generates Allure report → pushes test results via email or enterprise messaging (WeChat Work/DingTalk) → automatically creates defect tickets on failure. This "unattended" execution mode is the key to maximizing the value of API automation testing.

Pragmatic Advice: What AI Can Do vs. What Humans Should Do

Defining Clear Boundaries

In the semi-AI mode, the division of labor should be clear:

Testers are responsible for:

Framework architecture design and construction
Core logic encapsulation (correlation handling, encryption/signing, etc.)
Environment configuration and CI/CD integration
Test case design for complex business scenarios

AI assists with:

Generating basic test cases from API documentation (e.g., converting Swagger documents into YAML-format test case templates)
Rapid code snippet generation and completion (e.g., auto-generating function implementations from comments, completing assertion logic)
Test data construction (generating test data conforming to specific rules such as phone numbers, ID numbers, bank card numbers)
Simple script initialization (e.g., quickly generating test file skeletons for new modules based on project templates)
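The first item above, turning API documentation into case templates, is mechanical enough that its deterministic part can be shown directly. The sketch below walks an OpenAPI-style dictionary and emits one case skeleton per operation; AI or a tester would then fill in real values and assertions. The output layout (`name`, `request`, `validate`) is a hypothetical case format, and `$ref` parameters are skipped rather than resolved.

```python
from typing import List

HTTP_METHODS = {"get", "post", "put", "delete", "patch"}


def cases_from_openapi(spec: dict) -> List[dict]:
    """Build one test case skeleton per operation in an OpenAPI-style dict."""
    cases = []
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            # Query parameters become typed placeholders like "<string>".
            query = {
                p["name"]: f"<{p.get('schema', {}).get('type', 'any')}>"
                for p in operation.get("parameters", [])
                if p.get("in") == "query" and "name" in p
            }
            cases.append({
                "name": operation.get("summary") or f"{method.upper()} {path}",
                "request": {"method": method.upper(), "path": path, "params": query},
                # Placeholder assertion; real expectations come from the tester.
                "validate": [{"eq": ["status_code", 200]}],
            })
    return cases
```

The resulting list can be dumped to YAML with whatever serializer the framework already uses, giving each new module a consistent starting file.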

Key Insights

The core philosophy of this semi-AI approach is: The framework is infrastructure that must be kept stable and reliable by humans; AI is an efficiency tool that accelerates work on the premise of a stable framework.

Don't expect AI to replace the systematic thinking of test engineers, but don't ignore its efficiency advantages in repetitive work either. Finding this balance is the correct way to leverage AI-assisted testing at the current stage. Based on practical data, properly applying the semi-AI mode can improve API automation test case writing efficiency by 30%-50%, but the prerequisite is that the framework itself is robust enough to provide a stable execution environment for AI-generated content.

Conclusion

The semi-AI mode is not a compromise but a pragmatic choice based on current AI capability boundaries. As large model capabilities continue to improve—particularly in code understanding, context memory, and tool invocation (Function Calling/MCP)—the proportion of work AI can handle will gradually increase. However, in the foreseeable future, test engineers' command of frameworks and understanding of business logic remain irreplaceable core competencies.

It's worth noting that with the development of AI Agent technology, intelligent agents capable of autonomously executing multi-step testing tasks may emerge in the future. However, this requires solving three core problems: reliability, explainability, and controllability. Until then, the semi-AI mode will continue to serve as the best practice for API automation testing.