
In recent years, the application of Natural Language Processing (NLP) technologies and Large Language Models (LLMs) to Intellectual Property (IP) has attracted growing interest, especially among tech companies that face constant pressure to innovate, protect their distinctive know-how, and maintain a competitive edge. Previous articles covered what LLMs are and how they work. Here we focus on LLMs applied to patents: we outline their limitations and show why semantic analysis and machine learning for IP remain valuable tools in this domain, even if they might seem “outdated” next to newer AI technologies.
The strategic value of patent data
Patents are not just legal instruments; they are one of the richest sources of technological knowledge, capable of signaling trends as much as ten years in advance. They help companies understand where competitors are investing, which problems remain unsolved, and which solutions already exist, significantly reducing R&D waste. However, the sheer volume of data, its uneven quality, and the complexity of technical language make analysis extremely difficult. AI in intellectual property management promises to simplify this scenario, but in practice the complexity and volume of patent data require a more structured approach.
What are the limitations of LLMs applied to IP?
Large Language Models have revolutionized natural language processing, enabling advanced queries, generative summarization, and large-scale semantic analysis. However, when applied to patents, structural limitations emerge. LLMs are statistical models that learn correlations and patterns between words without actually understanding the underlying technical concepts. This leads to “hallucinations,” i.e., answers generated with confidence but without solid foundations. In a Stanford study of leading AI legal research tools, hallucination rates ranged between 17% and 33% [1], an unacceptable level for analyses that drive high-impact investments, strategies, and decisions.
Another limitation concerns the ability of LLMs to correctly interpret the technical function of the solutions described in patents. Engineering language is based on causal relationships and physical principles. LLMs, by contrast, operate on probabilistic associations and cannot intrinsically understand the underlying physical mechanisms. Additionally, AI can produce different responses to similar inputs without offering a clear justification from which its reasoning can be reconstructed. Models also struggle to detect “weak signals,” thereby losing strategically relevant information [2].
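One simple way to make this inconsistency visible is to run the same query twice and measure how much the two result sets overlap. The sketch below is only an illustration: the `jaccard` helper and the publication numbers are hypothetical and do not come from the test described in this article.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of items two result sets have in common (1.0 = identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical publication numbers returned by two runs of the same query.
run_1 = {"EP0000001", "US0000002", "JP0000003", "WO0000004"}
run_2 = {"EP0000001", "US0000002", "KR0000005"}

overlap = jaccard(run_1, run_2)
print(f"Overlap between runs: {overlap:.2f}")
```

A low overlap does not say which run is right, only that the output is unstable and needs expert review.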
An additional challenge concerns the economics of AI projects. Agentic systems, designed to automate complex sequences of operations, are proving costly to develop and maintain, with uncertain financial returns. According to Gartner, over 40% of agentic AI projects risk being abandoned by 2027 for this reason. This shows how difficult it is to entrust strategic and complex processes to tools that cannot offer solid reliability guarantees.
The hybrid approach: a more reliable model
The solution is not to abandon LLMs, but to integrate them into a more robust approach that combines artificial and human intelligence. We tested this with a patent landscape analysis that compared three distinct methodologies:
- The first uses a generalist AI that performs web searches (e.g., ChatGPT Search) without expert supervision. This method is common and accessible to many, but fundamentally incorrect, since an LLM is not a search engine operating on a complete technical database (such as patents or scientific articles).
- The second involves AI guided by a domain expert.
- The third is based on models trained on Erre Quadro’s proprietary databases, integrated with a human-in-the-loop approach to ensure checks, verifications, and the ability to confidently stop the analysis when results are complete, relevant, and free of noise.
The chosen technological domain was terrestrial rovers for logistics, i.e., autonomous vehicles designed to move materials, products, or goods in industrial or outdoor environments without direct human intervention. Here is what emerged.
The results
When asked about the most profitable R&D directions in the sector, the generalist AI, not actually searching a patent database, returned a generic narrative based on content found online, without verifiable data, numbers, or technical evidence. The descriptions were plausible but lacked grounding in real data, a clear sign of the unreliability of language models when used without context.
“In recent years, patent innovation has focused on compact autonomous rovers for last-mile deliveries, designed to move on sidewalks, bike lanes, and company courtyards, with particular attention to safe urban navigation and intelligent package handling (PatentPC, SpringerOpen). Key patented technologies include integrated lidar/vision systems to avoid pedestrian and vehicular obstacles, and mixed operational modes (autonomous and remotely controlled) for critical situations (Google Patents, Supply Chain Dive)…”
[1] Magesh, V., Surani, F., Dahl, M., Suzgun, M., Manning, C. D., & Ho, D. E. (2024). Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools. Stanford Human-Centered AI.
[2] Ikoma, H., & Mitamura, T. (2025). Can AI Examine Novelty of Patents?: Novelty Evaluation Based on the Correspondence between Patent Claim and Prior Art.
The expert-guided AI takes a step forward because the human provides more precise instructions, defines the boundaries of the analysis, and directs the model toward specific topics. However, even in this case, the results remain incomplete, as the system deduces responses by reprocessing the information it received during training and from the web, rather than extracting it from a complete and updated database. Numerical estimates are approximate (“hundreds of patent families”), geographic distribution is imprecise, and the identification of the main assignees does not reflect the reality of the sector, which involves actors of very different scale and nature compared to those cited by generalist models. Technical classifications, finally, are partial, unsystematic, and unable to represent the entire spectrum of technological challenges.
“From 2017 to 2024, patents on autonomous rovers have increased, especially between 2019 and 2023, highlighting a rapidly developing sector. The solutions mainly involve L4–L5 rovers operating on sidewalks and pedestrian areas, with real-world trials conducted on campuses and in cities of countries such as the United Kingdom, the United States, South Korea, and Japan. Among the most active players are Starship Technologies, Amazon Scout, and various universities and startups, contributing to the advancement of navigation, sensing systems, and compartment management. Analysis of the technical problems addressed also shows a varied but coherent picture: most patents focus on sidewalk navigation and safe street crossing, followed by solutions for mapping and visual localization, multi-compartment management, and robot-box interfaces for loading and unloading. Other emerging themes include the management of indoor and outdoor environments, obstacle resilience, and package handling, while a growing niche is dedicated to temperature control for food or pharmaceutical deliveries…”
The third methodology relies on AI trained on proprietary databases, updated weekly with the most recent documents, and on a continuously expanded repository of domain entities. This makes it possible to build a tool that recognizes all and only the relevant documents, and the results improve further when the tool is integrated into a human-in-the-loop approach.
The results first provided a precise quantification of the analyzed patent set, composed of 15,787 publications and 4,961 DOCDB families. Geographic distribution was correctly reconstructed, and the ranking of assignees clearly emerged.
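The two figures measure different things: a family groups the publications that relate to the same invention, so 15,787 publications in 4,961 families corresponds to roughly 3.2 publications per family. The sketch below shows the counting logic on a handful of hypothetical records; the publication numbers and family identifiers are invented for illustration and are not taken from the analyzed set.

```python
from collections import defaultdict

# Hypothetical records: (publication number, family identifier).
publications = [
    ("EP1000001A1", "F1"),
    ("EP1000001B1", "F1"),
    ("US2000002A1", "F1"),
    ("JP3000003A", "F2"),
    ("WO4000004A1", "F3"),
    ("CN5000005A", "F3"),
]

# Group publications under the family they belong to.
families: dict[str, list[str]] = defaultdict(list)
for pub_number, family_id in publications:
    families[family_id].append(pub_number)

n_publications = len(publications)
n_families = len(families)
print(n_publications, "publications in", n_families, "families")
print(f"Average publications per family: {n_publications / n_families:.1f}")
```

Counting families rather than publications avoids counting the same invention several times when it is filed in multiple jurisdictions.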

The classification of technical problems was also richer and more articulated. AI trained on proprietary databases can distinguish functions mentioned in patents following engineering logic. The results below are organized by problems and subproblems addressed by the various inventions, along with their frequency and relevance, measured based on occurrence over the last five years.
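As a rough illustration of how frequency and recent relevance can be computed once each document has been tagged with the problem it addresses, consider the sketch below. The tagged records, the reference year, and the exact definition of the five-year window are assumptions made for this example, not the scoring used in the analysis.

```python
from collections import Counter

REFERENCE_YEAR = 2024  # assumed year of analysis
WINDOW = 5             # "last five years": 2020 to 2024 under this assumption

# Hypothetical (technical problem, publication year) pairs.
tagged = [
    ("collision avoidance", 2016),
    ("collision avoidance", 2021),
    ("collision avoidance", 2023),
    ("wireless connection", 2022),
    ("wireless connection", 2024),
    ("traffic control", 2015),
    ("traffic control", 2017),
]

# Overall frequency of each problem, and frequency inside the recent window.
frequency = Counter(problem for problem, _ in tagged)
recent = Counter(
    problem for problem, year in tagged if year > REFERENCE_YEAR - WINDOW
)

for problem, total in frequency.most_common():
    share = recent[problem] / total
    print(f"{problem}: {total} documents, {share:.0%} in the last {WINDOW} years")
```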

This approach identified information that the other methods missed, such as the use of wireless connections, management of unpredictable environmental circumstances, collision avoidance, and traffic control, which indicates a more complete and more precise analysis.
From data to decision: why the hybrid model is indispensable
The case study shows that the difference between a generalist model and an IP-specific model guided by human expertise is not only quantitative but qualitative. Generalist AI creates a synthesis that may be useful in brainstorming, but it is not suitable for strategic decisions. The hybrid approach offers a sector representation based on accurate, up-to-date, structured, and verifiable data. In this test, only the hybrid model recovered not just what is obvious and frequent, but also what is rare or emerging, such as the weak signals that anticipate future innovation. The ability to capture these “tails” of insight is crucial for identifying technological areas where new solutions with real patent potential can be positioned.
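To make the idea of a weak signal concrete, one possible heuristic flags topics that are rare overall but clearly growing in the most recent years. The sketch below is a minimal example under stated assumptions: the yearly counts, the topic "manual tow bars", and both thresholds are hypothetical, and this is not the method used in the case study.

```python
# Hypothetical yearly family counts per topic (oldest year first).
yearly_counts = {
    "sidewalk navigation": [40, 45, 50, 52, 55, 60],
    "temperature control": [0, 0, 1, 1, 3, 4],
    "manual tow bars": [3, 2, 1, 1, 0, 0],
}

MAX_TOTAL = 20    # assumed ceiling for a topic to count as "rare"
MIN_GROWTH = 2.0  # assumed minimum ratio of recent half to earlier half


def is_weak_signal(counts: list[int]) -> bool:
    """Rare overall, but clearly growing in the most recent half of the period."""
    half = len(counts) // 2
    earlier, recent = sum(counts[:half]), sum(counts[half:])
    if sum(counts) > MAX_TOTAL or recent == 0:
        return False
    # A topic with no earlier activity but some recent activity also qualifies.
    return earlier == 0 or recent / earlier >= MIN_GROWTH


weak_signals = [
    topic for topic, counts in yearly_counts.items() if is_weak_signal(counts)
]
print(weak_signals)
```

A frequency ranking alone would place such a topic near the bottom, which is why rare but growing themes are easy to miss without complete data.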
Conclusions
Patent analysis requires technical expertise, data reliability, and interpretive capacity. LLMs, when used alone, cannot guarantee these requirements. Value emerges from the synergy of advanced tools, proprietary databases, and human supervision: this combination transforms AI into a strategic ally, capable of accelerating analysis without sacrificing accuracy. The case study demonstrates that only a hybrid approach allows tech companies to clearly identify opportunities, accurately assess risks, and define truly informed IP strategies.


