Annotation and Decoding of Images in Patents

In the patent world, images are not just accompanying illustrations; they are an integral part of the document and often carry crucial information, albeit communicated more subtly. Unlike photographs or realistic renderings, patent drawings are typically wireframes or stylized sketches that serve a functional, technical purpose. They do not merely visualize an invention; they also incorporate details that support the technical interpretation of the patent text.

Patent images, often simple and stylized, are far removed from real photos for a good reason: they are a form of “encoding” that makes the technical content of the invention clearer but, at the same time, less accessible to untrained eyes. A patent figure is not merely a technical or mechanical drawing, and it does not strictly follow CAD or technical-drawing standards. Instead, patent figures use their own visual language to represent the components and functions of the invention. Just as the text of a patent is written in specialized jargon (legal terminology), the images follow their own “dialect”, one rarely found in other technical documents (manuals, specifications, etc.).

This is why innovative tools like Erre Quadro’s TagMyPic are emerging as essential solutions for navigating between text and images in patents. By combining image and text analysis, the tool quickly links patent text to the corresponding images, improving efficiency in both the analysis and the drafting of patent documents.

TagMyPic’s functionality goes beyond simple image annotation. Its navigation feature lets analysts move quickly between the descriptive text and the associated images, which makes the patent easier to understand and interpret. Moving from figures to the text describing them, and from components to their functions and interfaces, makes navigation a truly multimodal experience that improves the efficiency and, ideally, the effectiveness of the exploration.
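One simple way to picture this kind of figure-to-text navigation is an index from figure numbers to the sentences that cite them. The Python sketch below is only an illustration under stated assumptions, not a description of how TagMyPic works: the function name `index_figure_mentions`, the regular expressions, and the sample description are all made up for the example, and the naive sentence splitter would need replacing for real patent text.

```python
import re
from collections import defaultdict

# Assumed citation styles: "FIG. 1", "Fig 2", "Figure 3", "Figs. 1 and 2".
# Ranges such as "FIGS. 1-3" are NOT expanded in this sketch.
FIG_REF = re.compile(
    r"\bfig(?:ure)?s?\.?\s*(\d+(?:\s*(?:,|and)\s*\d+)*)", re.IGNORECASE
)

# Naive splitter: break after ., ! or ? when the next word starts with a capital.
# "FIG. 1" survives because its full stop is followed by a digit.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def index_figure_mentions(description: str) -> dict[int, list[str]]:
    """Map each figure number to the sentences that cite it, in reading order."""
    index: dict[int, list[str]] = defaultdict(list)
    for sentence in SENTENCE_BREAK.split(description.strip()):
        cited = set()
        for match in FIG_REF.finditer(sentence):
            # A single citation may list several figures ("1 and 2").
            cited.update(int(n) for n in re.findall(r"\d+", match.group(1)))
        for number in sorted(cited):
            index[number].append(sentence)
    return dict(index)


# Hypothetical description text, invented for this example.
sample = (
    "FIG. 1 shows a housing with a lever. The lever pivots about a pin. "
    "As seen in Figure 2, the pin is retained by a clip. "
    "Figs. 1 and 2 are drawn to the same scale."
)
figure_index = index_figure_mentions(sample)
for number, sentences in sorted(figure_index.items()):
    print(number, sentences)
```

Selecting a figure then amounts to a dictionary lookup, and the reverse direction (from a sentence to the figures it cites) can be built the same way.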

The “Image Enhancement” feature lets users enrich an image with their own annotations, creating a direct link between text and image and increasing the level of detail and customization of the visual representation. It fosters direct interaction between design or research and development teams and intellectual property experts, a dialogue made possible by tools that translate between their respective dialects (or languages).

Even more advanced is the “Problem-Solution” feature, which offers an immediate visualization of the technical problems solved by the invention, along with the proposed solutions, making the technical impact of the invention clearer. When linked to images, this might even make Engineering Design experts think of a proto-Axiomatic Design!

The Fusion of Text and Images: A Step Beyond

Although multimodal technologies are gaining ground, two challenges remain open:

1. Multimodal systems such as CLIP are trained on images sourced from the web, typically photographs and other pictures in color or black and white, and struggle significantly when “digesting” wireframes: schematic, often three-dimensional representations in which only the main lines and contours are drawn to define the structure.

2. There is still a significant gap between how humans seamlessly integrate information from text and images and how machines do it. In human understanding, text and images merge intuitively into a unified mental representation. Machines, in contrast, still treat text and images as two separate worlds that are processed independently and only later combined, using multimodal techniques, to reach a unified interpretation.
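The “two separate worlds” of the second point can be made concrete with the dual-encoder pattern that CLIP-style models follow: text and image are embedded independently, and the two only meet when their vectors are compared. The Python sketch below assumes the embeddings already exist; the three-dimensional vectors are invented toy values standing in for real encoder outputs, and `rank_figures` is a hypothetical helper, not part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_figures(text_vec, figure_vecs):
    """Late fusion: score each figure embedding against one text embedding."""
    scores = {name: cosine(text_vec, vec) for name, vec in figure_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy embeddings (assumption): in practice these would come from a text encoder
# and an image encoder that were run separately.
text_embedding = [0.9, 0.1, 0.0]
figure_embeddings = {
    "FIG. 1": [0.8, 0.2, 0.1],
    "FIG. 2": [0.0, 0.3, 0.9],
}

ranked = rank_figures(text_embedding, figure_embeddings)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Nothing in this pipeline lets the text influence how the image is read, or the reverse, until the final comparison; that is the gap with human reading described above.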

There is also another side to images in patents: in some cases, they are used to “hide” the true content of the invention, making it less immediately interpretable and harder for competitors to find. For this reason, Erre Quadro has developed a series of strategies and techniques for unveiling the structure and functions hidden behind patent drawings. This approach allows functional and structural queries to be run directly on the images, making patent search and analysis more effective.
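To give a rough idea of what a functional or structural query could look like once the components of a drawing have been identified, here is a minimal Python sketch. It is purely hypothetical and says nothing about Erre Quadro's actual techniques: the `Component` record, its fields, the two query helpers, and the sample drawing are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """Hypothetical record for one annotated element of a patent drawing."""
    label: str                  # name used in the description, e.g. "lever"
    figure: int                 # figure in which the component is drawn
    parent: Optional[str]       # enclosing component, None at the top level
    functions: tuple[str, ...]  # verbs describing what the component does


def by_function(components, verb):
    """Functional query: labels of components that perform `verb`."""
    return [c.label for c in components if verb in c.functions]


def parts_of(components, parent):
    """Structural query: labels of the direct children of `parent`."""
    return [c.label for c in components if c.parent == parent]


# Invented example data, not taken from any real patent.
drawing = [
    Component("housing", 1, None, ("enclose",)),
    Component("lever", 1, "housing", ("pivot", "transmit")),
    Component("pin", 2, "housing", ("retain",)),
    Component("clip", 2, "pin", ("retain",)),
]

print(by_function(drawing, "retain"))
print(parts_of(drawing, "housing"))
```

The hard part, which this sketch takes for granted, is producing such records from the drawing and the text in the first place.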

Integrating text and images remains an open challenge for improving the analysis and understanding of patents as complex technical documents. Tools like TagMyPic are pushing the boundaries of what is possible in patent navigation and annotation, giving analysts and attorneys new capabilities to decode and unlock the value of patent information.

If you have a use case you’d like to test or an innovative and challenging application, now is the perfect time to explore the new frontiers of patent analysis.

Request a demo of our AI tools.
