Sample Code for Converting PDF to HTML/Word/Excel/PPT/PNG in Java

Downloading Aspose.PDF with Maven

Add the following configuration to pom.xml to use Aspose.PDF for Java directly from a Maven-based project.

<repositories>
    <repository>
        <id>AsposeJavaAPI</id>
        <name>Aspose Java API</name>
        <url>https://repository.aspose.com/repo/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-pdf</artifactId>
        <version>22.4</version>
    </dependency>
</dependencies>

Core implementation (single class)

import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;
import com.aspose.pdf.devices.PngDevice;
import com.aspose.pdf.devices.Resolution;
import java.io.*;

public class PdfHelper3 {

    public static void main(String[] args) throws IOException {
        pdf2image("c:\\users\\liuya\\desktop\\pdf\\示例文件.pdf");
    }

    // Convert to Word (.docx)
    public static void pdf2word(String pdfPath) {
        long old = System.currentTimeMillis();
        try {
            // Output file sits next to the PDF, with the extension swapped
            String wordPath = pdfPath.substring(0, pdfPath.lastIndexOf(".")) + ".docx";
            FileOutputStream os = new FileOutputStream(wordPath);
            Document doc = new Document(pdfPath);
            doc.save(os, SaveFormat.DocX);
            os.close();
            long now = System.currentTimeMillis();
            System.out.println("PDF to Word took " + ((now - old) / 1000.0) + " s");
        } catch (Exception e) {
            System.out.println("PDF to Word conversion failed...");
            e.printStackTrace();
        }
    }

    // Convert to PowerPoint (.pptx)
    public static void pdf2ppt(String pdfPath) {
        long old = System.currentTimeMillis();
        try {
            // SaveFormat.Pptx writes the PPTX format, so the extension must be .pptx, not .ppt
            String pptPath = pdfPath.substring(0, pdfPath.lastIndexOf(".")) + ".pptx";
            FileOutputStream os = new FileOutputStream(pptPath);
            Document doc = new Document(pdfPath);
            doc.save(os, SaveFormat.Pptx);
            os.close();
            long now = System.currentTimeMillis();
            System.out.println("PDF to PPT took " + ((now - old) / 1000.0) + " s");
        } catch (Exception e) {
            System.out.println("PDF to PPT conversion failed...");
            e.printStackTrace();
        }
    }

    // Convert to Excel (.xlsx)
    public static void pdf2excel(String pdfPath) {
        long old = System.currentTimeMillis();
        try {
            String excelPath = pdfPath.substring(0, pdfPath.lastIndexOf(".")) + ".xlsx";
            FileOutputStream os = new FileOutputStream(excelPath);
            Document doc = new Document(pdfPath);
            doc.save(os, SaveFormat.Excel);
            os.close();
            long now = System.currentTimeMillis();
            System.out.println("PDF to Excel took " + ((now - old) / 1000.0) + " s");
        } catch (Exception e) {
            System.out.println("PDF to Excel conversion failed...");
            e.printStackTrace();
        }
    }

    // Convert to HTML
    public static void pdf2html(String pdfPath) {
        long old = System.currentTimeMillis();
        try {
            String htmlPath = pdfPath.substring(0, pdfPath.lastIndexOf(".")) + ".html";
            Document doc = new Document(pdfPath);
            doc.save(htmlPath, SaveFormat.Html);
            long now = System.currentTimeMillis();
            System.out.println("PDF to HTML took " + ((now - old) / 1000.0) + " s");
        } catch (Exception e) {
            System.out.println("PDF to HTML conversion failed...");
            e.printStackTrace();
        }
    }

    // Convert to images (one PNG per page)
    public static void pdf2image(String pdfPath) {
        long old = System.currentTimeMillis();
        try {
            Resolution resolution = new Resolution(300);
            // Images go into a "<pdf name>_images" folder next to the PDF
            String dataDir = pdfPath.substring(0, pdfPath.lastIndexOf("."));
            File imageDir = new File(dataDir + "_images");
            imageDir.mkdirs();
            Document doc = new Document(pdfPath);
            PngDevice pngDevice = new PngDevice(resolution);
            // Page numbers are 1-based
            for (int pageCount = 1; pageCount <= doc.getPages().size(); pageCount++) {
                OutputStream imageStream = new FileOutputStream(new File(imageDir, pageCount + ".png"));
                pngDevice.process(doc.getPages().get_Item(pageCount), imageStream);
                imageStream.close();
            }
            long now = System.currentTimeMillis();
            System.out.println("PDF to PNG took " + ((now - old) / 1000.0) + " s");
        } catch (Exception e) {
            System.out.println("PDF to PNG conversion failed...");
            e.printStackTrace();
        }
    }
}
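
Every method above derives its output path with substring(0, lastIndexOf(".")), which throws if the input has no dot and can cut in the wrong place if only a parent folder contains one. Below is a minimal sketch of a safer helper. OutputPaths and replaceExtension are hypothetical names for illustration and are not part of Aspose.PDF.

```java
/** Hypothetical helper (not part of Aspose.PDF) for deriving output file names. */
final class OutputPaths {
    private OutputPaths() {}

    /** Returns path with its file extension replaced by newExt (given without a dot). */
    static String replaceExtension(String path, String newExt) {
        // Position of the last directory separator, Unix or Windows style
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        // Only treat the dot as an extension marker if it is inside the file name
        // and is not the file name's first character
        String base = (dot > sep + 1) ? path.substring(0, dot) : path;
        return base + "." + newExt;
    }

    public static void main(String[] args) {
        System.out.println(replaceExtension("C:\\docs\\report.pdf", "docx")); // C:\docs\report.docx
        System.out.println(replaceExtension("/tmp/v1.2/report", "html"));     // /tmp/v1.2/report.html
    }
}
```

With a helper like this, each conversion method could call replaceExtension(pdfPath, "docx") and so on instead of repeating the substring logic.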

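To run it, right-click the class in IDEA and choose Run. To turn it into a web system, wrap the code in a web service and call these methods from there.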
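
As a rough idea of what such a wrapper could look like, here is a sketch built on the JDK's own com.sun.net.httpserver.HttpServer. The class name ConvertServer, the /convert endpoint, the path query parameter, and the converter function are all assumptions made for illustration. The converter is a stand-in for a call such as PdfHelper3.pdf2word. Accepting file paths from clients is unsafe outside a trusted environment, so a real service would more likely take an uploaded file.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Hypothetical wrapper: GET /convert?path=<url-encoded pdf path> */
final class ConvertServer {
    private ConvertServer() {}

    /** Returns the decoded value of a query of the form "path=...", or null if absent. */
    static String extractPath(String rawQuery) {
        if (rawQuery == null || !rawQuery.startsWith("path=")) {
            return null;
        }
        try {
            return URLDecoder.decode(rawQuery.substring("path=".length()), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Starts the server; port 0 lets the OS pick a free port. */
    static HttpServer start(int port, Function<String, String> converter) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/convert", exchange -> {
            String path = extractPath(exchange.getRequestURI().getRawQuery());
            int status;
            String body;
            if (path == null) {
                status = 400;
                body = "missing 'path' query parameter";
            } else {
                try {
                    body = converter.apply(path);
                    status = 200;
                } catch (RuntimeException e) {
                    status = 500;
                    body = "conversion failed";
                }
            }
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(status, bytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(bytes);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder converter; in the real project this might be
        // path -> { PdfHelper3.pdf2word(path); return "done"; }
        start(8080, path -> "would convert: " + path);
        System.out.println("Listening on http://localhost:8080/convert?path=...");
    }
}
```

The query parsing is kept in its own method so that it can be checked without opening a socket.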

Conversion results

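Taking a 14-page PDF as an example (assuming the "fourteen" in the original refers to pages), most conversions took 10-12 s; only the PPT conversion was slower, at about 20 s. Probably because the PDF does not contain tabular content, the styling of the Excel output differs noticeably from the source. The other formats kept the same look as the original PDF.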
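
The timings above come from the System.currentTimeMillis() bookkeeping repeated in every method. As a sketch of one way to factor that out, a small helper can time any task. Timed and seconds are hypothetical names, and the workload in main is only a stand-in for a real conversion call.

```java
/** Hypothetical timing helper, not part of the original class. */
final class Timed {
    private Timed() {}

    /** Runs the task once and returns the elapsed wall-clock time in seconds. */
    static double seconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in workload; in the real project this might be
        // () -> PdfHelper3.pdf2word(path)
        double elapsed = seconds(() -> {
            long sum = 0;
            for (int i = 0; i < 10_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        });
        System.out.printf("task took %.3f s%n", elapsed);
    }
}
```

System.nanoTime() is used here because it is meant for measuring elapsed time, whereas System.currentTimeMillis() follows the system clock and can jump if the clock is adjusted.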

That covers the sample code for converting PDF to HTML/Word/Excel/PPT/PNG in Java.
