Package com.chatmotorapi.api.util
Class FileToTextConverter
java.lang.Object
com.chatmotorapi.api.util.FileToTextConverter
public class FileToTextConverter extends Object
A utility class for converting various file formats to plain text files.
This class supports HTML, DOC, DOCX, RTF, and PDF files by extracting text content
and saving it into a plain text file. The output encoding is UTF-8 by default.
Note: This class is not designed for very large files, because the entire content is loaded into memory during conversion.
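Because the content is held in memory, a caller may want to check the input size before converting. The following is a minimal sketch using only the JDK (Java 11 or later). The class name `ConversionGuard`, the helper `isWithinLimit`, and the 50 MiB threshold are hypothetical and not part of this library; the documented `convert` call appears only as a comment so that the sketch runs without the library on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ConversionGuard {
    // Hypothetical limit, not defined by the library: tune it to the available JVM heap.
    static final long MAX_INPUT_BYTES = 50L * 1024 * 1024;

    /** Returns true when the file size does not exceed maxBytes. */
    static boolean isWithinLimit(Path input, long maxBytes) throws IOException {
        return Files.size(input) <= maxBytes;
    }

    public static void main(String[] args) throws IOException {
        // A small temporary HTML file stands in for a real input document.
        Path input = Files.createTempFile("sample", ".html");
        try {
            Files.writeString(input, "<p>hello</p>", StandardCharsets.UTF_8);
            if (isWithinLimit(input, MAX_INPUT_BYTES)) {
                // With the library available, the documented call would go here:
                // FileToTextConverter.convert(input.toString(), "sample.txt");
                System.out.println("OK to convert: " + input);
            } else {
                System.out.println("Too large for in-memory conversion: " + input);
            }
        } finally {
            Files.deleteIfExists(input);
        }
    }
}
```

The size on disk is only a rough proxy for memory use, since the extracted text and any intermediate document model may be larger or smaller than the source file.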
-
Constructor Summary
Constructors
FileToTextConverter()
-
Method Summary
Modifier and Type, Method, Description
static void convert(String inputFilePath, String outputFilePath)
Extracts the text content of an HTML, DOC, DOCX, RTF, or PDF file and writes it to a plain text file.
static void convert(String inputFilePath, String outputFilePath, Charset charset)
Extracts the text content of an HTML, DOC, DOCX, RTF, or PDF file and writes it to a plain text file.
-
Constructor Details
-
FileToTextConverter
public FileToTextConverter()
-
Method Details
-
convert
public static void convert(String inputFilePath, String outputFilePath) throws IOException
Extracts the text content of an HTML, DOC, DOCX, RTF, or PDF file and writes it to a plain text file. The output file is encoded in UTF-8.
Parameters:
inputFilePath - the path of the file to be converted
outputFilePath - the path of the file where the converted text will be written
Throws:
IOException - if any error occurs during the conversion process
-
convert
public static void convert(String inputFilePath, String outputFilePath, Charset charset) throws IOException
Extracts the text content of an HTML, DOC, DOCX, RTF, or PDF file and writes it to a plain text file, encoded with the given charset.
Parameters:
inputFilePath - the path of the file to be converted
outputFilePath - the path of the file where the converted text will be written
charset - the charset used to encode the output file
Throws:
IOException - if any error occurs during the conversion process
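Since the charset argument decides how the output file is encoded, whatever reads that file afterwards needs to use the same charset. The sketch below illustrates this with the JDK alone (Java 11 or later). It does not call the library: `Files.writeString` stands in for the file that `convert` would produce, and the names `CharsetRoundTrip`, `roundTrip`, and `encodedLength` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CharsetRoundTrip {
    /** Writes text with the given charset and reads it back with the same charset. */
    static String roundTrip(String text, Charset charset) throws IOException {
        Path out = Files.createTempFile("converted", ".txt");
        try {
            // Stand-in for: FileToTextConverter.convert(inputPath, out.toString(), charset);
            Files.writeString(out, text, charset);
            return Files.readString(out, charset);
        } finally {
            Files.deleteIfExists(out);
        }
    }

    /** Number of bytes the text occupies once encoded with the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1).equals(text));
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1));
    }
}
```

The same four characters take a different number of bytes under UTF-8 and ISO-8859-1, which is why a file written with one charset and read with another can show corrupted accented characters.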
-