OOPS: Automated generation of REST API specification via LLMs
arXiv:2601.12735v2 Announce Type: replace
Abstract: REST APIs, based on the REpresentational State Transfer (REST) architecture, are the primary type of Web API. The OpenAPI Specification (OAS) serves as the de facto standard for describing REST APIs and is crucial for multiple software engineering tasks. Automated OAS generation can help developers identify and correct issues in manually maintained OAS, but existing approaches rely on technology-specific rules and human expert intervention. The powerful code understanding capabilities of LLMs offer the potential to overcome these limitations, but they introduce additional challenges such as context length limitations and hallucinations. To address these challenges, we propose OOPS, the first technology-agnostic approach that leverages LLM-based static analysis of server code for OAS generation. Through an LLM agent workflow comprising two key steps, endpoint method extraction and OAS generation, OOPS eliminates the need for technology-specific rules or human expert intervention. By constructing an API dependency graph, it establishes the necessary file associations to address LLMs' context length limitations. Through multi-stage generation and self-refinement, it mitigates both syntactic and semantic hallucinations during OAS generation. We evaluated OOPS on 12 real-world REST APIs spanning 5 programming languages and 8 development frameworks. Experimental results demonstrate that OOPS accurately generates high-quality OAS for REST APIs implemented with diverse technologies, achieving an average F1-score exceeding 98% for endpoint method inference, 97% for both request parameter and response inference, and 92% for parameter constraint inference. Input tokens average below 5.6K with a maximum of 16.13K, while output tokens average below 0.9K with a maximum of 7.63K.
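To make the reported metric concrete, the following is a minimal sketch of how an F1-score for endpoint method inference could be computed. It assumes an endpoint method is identified by an (HTTP method, path) pair and that a prediction counts only on an exact match; the abstract does not state the paper's matching criteria, so both are assumptions. The function name `endpoint_scores` and the sample endpoints are hypothetical and not taken from the paper.

```python
from typing import NamedTuple


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def endpoint_scores(
    predicted: set[tuple[str, str]],
    expected: set[tuple[str, str]],
) -> Scores:
    """Compare predicted and ground-truth (HTTP method, path) pairs.

    Assumption: exact-match comparison; the paper may match differently.
    """
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return Scores(precision, recall, 0.0)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return Scores(precision, recall, f1)


# Hypothetical data for illustration only.
expected = {
    ("GET", "/users"),
    ("POST", "/users"),
    ("GET", "/users/{id}"),
    ("DELETE", "/users/{id}"),
}
predicted = {
    ("GET", "/users"),
    ("POST", "/users"),
    ("GET", "/users/{id}"),
    ("PUT", "/users/{id}"),  # hallucinated endpoint method
}
scores = endpoint_scores(predicted, expected)
```

In this illustration, one hallucinated endpoint method and one missed endpoint method each lower the score: three of four predictions are correct and three of four expected endpoint methods are found, so precision, recall, and F1 all come out to 0.75.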