编程语言
首页 > 编程语言> > 弹性映射减少超时java.io.IOException:意外的流结束

弹性映射减少超时java.io.IOException:意外的流结束

作者:互联网

我正在运行MAP减少作业(弹性地图减少EMR)服务.该工作适用于小型数据集,但对大型数据集提供以下例外(文件大小400MB)

使用相同的大输入文件运行另一个作业工作正常但是.为什么?

Error: java.io.IOException: Unexpected end of stream pos=19520590 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=222629174 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=155898310 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=88645447 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=19520590 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=291259038 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=222629174 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=155898310 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=88645447 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=19520590 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=291259038 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=222629174 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=155898310 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

Error: java.io.IOException: Unexpected end of stream pos=88645447 contentLength= 458603265
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:313)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

我跑了另一个类似配置的工作,它工作得很好.任何输入?

解决方法:

我遇到了类似火花蜂巢的问题.问题是我发出了一个hive查询来写入s3中的一个位置,由一个数据帧支持,然后立即从中读取.就像是:

statusDF.createOrReplaceTempView("statusDF")
spark.sql(s"INSERT OVERWRITE TABLE ${table} SELECT id, last_updated FROM statusDF")
statusDF.show(5)

所以有一种可能性是你在写入同一个S3位置(但尚未完成)的其他内容中进行读取,这就产生了这个错误.

标签:java,hadoop,elastic-map-reduce
来源: https://codeday.me/bug/20190708/1406341.html