java – How do I split a large XML file into small chunks using VTDGenHuge?
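I want to split a large XML file into small chunks. I am using VTDGen to do the splitting, and it works for files smaller than 2 GB. VTD-XML parses the XML in memory, and I don't want to load the whole file into memory, so I am trying to use VTDGenHuge with memory mapping. The code works with VTDGen, but it does not work when I switch to VTDGenHuge.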
// Needs java.io.File, java.io.FileOutputStream, java.util.Date and the VTD-XML huge-file classes.
String prefix = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + "<Employees>\n";
String suffix = "\n</Employees>\n";
try {
    VTDGenHuge vg = new VTDGenHuge();
    if (vg.parseFile("C:\\Users\\abc\\Desktop\\latestxml\\Input_1.xml", true, VTDGenHuge.MEM_MAPPED)) {
        // The value was lost in the source; 3 is an assumption that matches the sample output.
        int splitBy = 3;
        System.out.println("Started time" + new Date());
        VTDNavHuge vn = vg.getNav();
        AutoPilotHuge ap = new AutoPilotHuge(vn);
        ap.selectXPath("/Employees/Employee");
        FastLongBuffer flb = new FastLongBuffer(4);
        int i;
        // This is the call that returns null with VTDGenHuge.
        byte[] xml = vn.getXML().getBytes();
        // Record the fragment (offset and length) of every matching element.
        while ((i = ap.evalXPath()) != -1) {
            flb.append(vn.getElementFragment());
        }
        int size = flb.size();
        if (size != 0) {
            File fo = null;
            FileOutputStream fos = null;
            for (int k = 0; k < size; k++) {
                // Close the current chunk file after every splitBy elements.
                if (k % splitBy == 0) {
                    if (fo != null) {
                        fos.write(suffix.getBytes());
                        fos.close();
                        fo = null;
                    }
                }
                // Start a new chunk file when none is open.
                if (fo == null) {
                    fo = new File("C:\\Users\\abc\\Desktop\\Test\\xml\\" + "out" + k + ".xml");
                    fos = new FileOutputStream(fo);
                    fos.write(prefix.getBytes());
                }
                fos.write("\n".getBytes());
                fos.write(xml, flb.lower32At(k), flb.upper32At(k));
            }
            // Finish the last chunk file.
            if (fo != null) {
                fos.write(suffix.getBytes());
                fos.close();
                fo = null;
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
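I get a null value from "byte[] xml = vn.getXML().getBytes();".
If you print vn.getXML() with System.out.println, you get an object. But getBytes() returns null, and I don't know why. However, byteAt(x), where x is any long value, does return a value.
My XML file is: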
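For context, the call fos.write(xml, flb.lower32At(k), flb.upper32At(k)) relies on the convention of the standard (non-huge) VTDNav.getElementFragment(), which packs a fragment into a single long: the offset in the lower 32 bits and the length in the upper 32 bits. The sketch below only illustrates that packing; PackedFragment and its methods are hypothetical names, not part of VTD-XML.

```java
/** Hypothetical helper (not part of VTD-XML) showing the offset/length packing. */
final class PackedFragment {
    /** Packs a fragment as: upper 32 bits = length, lower 32 bits = offset. */
    static long pack(int offset, int length) {
        return ((long) length << 32) | (offset & 0xFFFFFFFFL);
    }

    /** Lower 32 bits: the byte offset where the element starts. */
    static int offset(long packed) {
        return (int) packed;
    }

    /** Upper 32 bits: the length of the element in bytes. */
    static int length(long packed) {
        return (int) (packed >>> 32);
    }

    public static void main(String[] args) {
        long fragment = pack(39, 120);
        System.out.println(offset(fragment) + " " + length(fragment)); // prints "39 120"
    }
}
```

A Java int offset cannot address bytes beyond 2 GB, which is consistent with the file size limit mentioned above. The huge-file API presumably has to report offsets and lengths in a wider form, so the lower32At/upper32At arithmetic should be re-checked against its Javadoc.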
<?xml version="1.0" encoding="UTF-8"?>
<Employees>
<Employee id="1">
<age>29</age>
<name>Pankaj</name>
<gender>Male</gender>
<role>Java Developer</role>
</Employee>
<Employee id="2">
<age>35</age>
<name>Lisa</name>
<gender>Female</gender>
<role>CEO</role>
</Employee>
<Employee id="3">
<age>40</age>
<name>Tom</name>
<gender>Male</gender>
<role>Manager</role>
</Employee>
<Employee id="1">
<age>29</age>
<name>Pankaj</name>
<gender>Male</gender>
<role>Java Developer</role>
</Employee>
<Employee id="2">
<age>35</age>
<name>Lisa</name>
<gender>Female</gender>
<role>CEO</role>
</Employee>
<Employee id="3">
<age>40</age>
<name>Tom</name>
<gender>Male</gender>
<role>Manager</role>
</Employee>
</Employees>
I want the output to look like this, with each chunk in its own file:
<?xml version="1.0" encoding="UTF-8"?>
<Employees>
<Employee id="1">
<age>29</age>
<name>Pankaj</name>
<gender>Male</gender>
<role>Java Developer</role>
</Employee>
<Employee id="2">
<age>35</age>
<name>Lisa</name>
<gender>Female</gender>
<role>CEO</role>
</Employee>
<Employee id="3">
<age>40</age>
<name>Tom</name>
<gender>Male</gender>
<role>Manager</role>
</Employee>
</Employees>
<?xml version="1.0" encoding="UTF-8"?>
<Employees>
<Employee id="1">
<age>29</age>
<name>Pankaj</name>
<gender>Male</gender>
<role>Java Developer</role>
</Employee>
<Employee id="2">
<age>35</age>
<name>Lisa</name>
<gender>Female</gender>
<role>CEO</role>
</Employee>
<Employee id="3">
<age>40</age>
<name>Tom</name>
<gender>Male</gender>
<role>Manager</role>
</Employee>
</Employees>
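Solution:
I think vn.getXML() in extended VTD-XML returns an object implementing the IByteBuffer interface, which is different from what standard VTD-XML returns. You can call the interface method named writeOutputToFile() and pass it the offset and length values. Sorry that the documentation is partly lacking, but this is basic low-level stuff...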
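To illustrate the idea of passing an offset and a length instead of materializing a byte[], here is a minimal sketch that uses only java.nio (FileChannel.transferTo), not VTD-XML itself. RangeSplitter, split and copyRange are hypothetical names. The {offset, length} pairs are assumed to be byte positions in the input file, for example the element fragments reported by the parser.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of VTD-XML: splits a file by known fragment ranges. */
final class RangeSplitter {
    static final byte[] PREFIX =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Employees>\n".getBytes(StandardCharsets.UTF_8);
    static final byte[] SUFFIX = "\n</Employees>\n".getBytes(StandardCharsets.UTF_8);

    /**
     * Writes the fragments into files of at most splitBy fragments each.
     *
     * @param fragments {offset, length} pairs in bytes (assumed to come from the parser)
     * @return the files written, in order
     */
    static List<Path> split(Path input, List<long[]> fragments, int splitBy, Path outDir)
            throws IOException {
        List<Path> written = new ArrayList<>();
        try (FileChannel in = FileChannel.open(input, StandardOpenOption.READ)) {
            for (int start = 0; start < fragments.size(); start += splitBy) {
                Path out = outDir.resolve("out" + start + ".xml");
                try (FileOutputStream fos = new FileOutputStream(out.toFile())) {
                    fos.write(PREFIX);
                    // The channel shares its position with the stream, so writes interleave safely.
                    FileChannel dst = fos.getChannel();
                    int end = Math.min(start + splitBy, fragments.size());
                    for (int k = start; k < end; k++) {
                        fos.write('\n');
                        copyRange(in, fragments.get(k)[0], fragments.get(k)[1], dst);
                    }
                    fos.write(SUFFIX);
                }
                written.add(out);
            }
        }
        return written;
    }

    /** Copies length bytes starting at offset without loading the input into a byte[]. */
    static void copyRange(FileChannel in, long offset, long length, FileChannel dst)
            throws IOException {
        long done = 0;
        while (done < length) {
            long n = in.transferTo(offset + done, length - done, dst);
            if (n <= 0) {
                throw new IOException("unexpected end of input at byte " + (offset + done));
            }
            done += n;
        }
    }
}
```

With six fragments and splitBy = 3, this would write out0.xml and out3.xml, following the file naming in the question. Because offsets and lengths are long values, the same loop works for inputs larger than 2 GB.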
Tags: java, xml, vtd-xml Source: https://codeday.me/bug/20190711/1433446.html