The Slide Extractor is a component that detects PPTX files and parses their slides using Apache Tika. It provides:
- Extracting text content from PPTX slides.
- Extracting metadata such as slide title, author, created date, and modified date.
- Configurable maximum character limit for processing large PPTX files.
- Configurable timeout for the parsing process.
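
The two configurable limits can be illustrated with a minimal sketch. It is not the Slide Extractor's actual code: the class `BoundedExtraction`, the method `extract`, and the parameters `maxChars` and `timeoutMillis` are hypothetical names chosen for illustration. The sketch runs an arbitrary parse task on a worker thread, gives up after the timeout, and truncates the extracted text to the character limit. In a Tika-based setup, the task would presumably wrap a call such as `AutoDetectParser.parse(...)`, and the character limit could instead be enforced by Tika itself through the write limit of `BodyContentHandler`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; all names here are assumptions.
final class BoundedExtraction {

    /** Unchecked signal that parsing exceeded the configured timeout. */
    static final class ParseTimeoutException extends RuntimeException {
        ParseTimeoutException(String message) {
            super(message);
        }
    }

    /**
     * Runs parseTask on a worker thread, waits at most timeoutMillis for it,
     * and truncates the returned text to at most maxChars characters.
     */
    static String extract(Callable<String> parseTask, int maxChars, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(parseTask);
        try {
            String text = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            if (text == null) {
                return "";
            }
            // Enforce the character limit on the extracted text.
            return text.length() <= maxChars ? text : text.substring(0, maxChars);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it can stop parsing
            throw new ParseTimeoutException("Parsing exceeded " + timeoutMillis + " ms");
        } catch (ExecutionException e) {
            throw new IllegalStateException("Parsing failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while waiting for the parser", e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would pass the parsing step as a lambda, for example `BoundedExtraction.extract(() -> parseSlides(file), 100_000, 30_000)`, where `parseSlides` stands in for whatever Tika invocation the component uses. Truncating after parsing is the simplest option to show; stopping the parser once the limit is reached avoids holding the full text of a very large file in memory.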