Motivation
The DLR Institute of Data Sciences develops solutions for the new challenges of the digital era, with a focus on data management, IT security, smart systems and citizen science. This work also covers planning processes in spaceflight. Satellite planning depends on technical products from suppliers, and the products available on the market are often described in technical documents provided as PDF files. Because this data source is unstructured, it cannot be searched by product properties, and the product type cannot be derived for use in a search.
Objective
- Developing a method to convert text documents (technical component descriptions) into structured data
- Deriving a search tool that queries specific product properties instead of text fragments
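To make the two objectives concrete, the following is a minimal sketch of the general idea, not the method developed in the project. It assumes the text has already been extracted from a PDF and that properties appear as lines of the form "name: value unit". The regular expression, the function names `extract_properties` and `find_components`, and the sample datasheet text are all hypothetical and chosen only for illustration.

```python
import re

# Assumed line format: "<property name>: <number> <unit>".
# Real component descriptions are far less regular than this.
PROPERTY_PATTERN = re.compile(
    r"^[ \t]*(?P<name>[A-Za-z][A-Za-z ]*?)[ \t]*:[ \t]*"
    r"(?P<value>-?\d+(?:\.\d+)?)[ \t]*(?P<unit>[A-Za-z%]+)[ \t]*$",
    re.MULTILINE,
)


def extract_properties(text):
    """Return {property name: (value, unit)} for every matching line."""
    properties = {}
    for match in PROPERTY_PATTERN.finditer(text):
        name = match.group("name").strip().lower()
        properties[name] = (float(match.group("value")), match.group("unit"))
    return properties


def find_components(catalog, name, max_value):
    """Return ids of components whose property `name` is at most `max_value`."""
    return [
        component_id
        for component_id, properties in catalog.items()
        if name in properties and properties[name][0] <= max_value
    ]


# Invented sample standing in for text extracted from a PDF datasheet.
SAMPLE_TEXT = """Reaction Wheel RW-Example
Mass: 1.5 kg
Supply voltage: 28 V
Peak power: 12.5 W
"""

catalog = {"rw-example": extract_properties(SAMPLE_TEXT)}
light_components = find_components(catalog, "mass", 2.0)
```

Once the properties are stored as name, value and unit, a query such as "mass of at most 2 kg" becomes a comparison on a numeric field rather than a full-text match, which is the shift from text fragments to product properties described in the objectives.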
Results
- Semantic search over component descriptions drawn from PDF collections
- Added value: simpler integration of documents into planning processes