java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1
I’m having some issues with custom classes in Giraph. I wrote my own vertex input and output formats, but I always get the following error:
java.io.IOException: ensureRemaining: Only * bytes remaining, trying to read *
where the values marked with “*” vary between runs.
This was tested on a single-node cluster.
The issue occurs when VertexIterator executes next() and there are no more vertices. The iterator is called from the flush method, but I don’t understand why next() fails. Here are the relevant logs and classes.
My logs are as follows:
15/09/08 00:52:21 INFO bsp.BspService: BspService: Connecting to ZooKeeper with job giraph_yarn_application_1441683854213_0001, 1 on localhost:22181
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:host.name=localhost
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_79
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.class.path=.:${CLASSPATH}:./**/
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/l$
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.version=3.13.0-62-generic
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.name=hduser
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hduser
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.dir=/app/hadoop/tmp/nm-local-dir/usercache/hduser/appcache/application_1441683854213_0001/container_1441683854213_0001_01_000003
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:22181 sessionTimeout=60000 [email protected]
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:22181. Will not attempt to authenticate using SASL (unknown error)
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:22181, initiating session
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:22181, sessionid = 0x14fab0de0bb0002, negotiated timeout = 40000
15/09/08 00:52:21 INFO bsp.BspService: process: Asynchronous connection complete.
15/09/08 00:52:21 INFO netty.NettyServer: NettyServer: Using execution group with 8 threads for requestFrameDecoder.
15/09/08 00:52:21 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/09/08 00:52:21 INFO netty.NettyServer: start: Started server communication server: localhost/127.0.0.1:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288
15/09/08 00:52:21 INFO netty.NettyClient: NettyClient: Using execution handler with 8 threads after request-encoder.
15/09/08 00:52:21 INFO graph.GraphTaskManager: setup: Registering health of this worker...
15/09/08 00:52:21 INFO yarn.GiraphYarnTask: [STATUS: task-1] WORKER_ONLY starting...
15/09/08 00:52:22 INFO bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/giraph_yarn_application_1441683854213_0001/_masterJobState)
15/09/08 00:52:22 INFO bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir already exists!
15/09/08 00:52:22 INFO bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir already exists!
15/09/08 00:52:22 INFO worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir/0/_superstepD$
15/09/08 00:52:22 INFO netty.NettyServer: start: Using Netty without authentication.
15/09/08 00:52:22 INFO bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
15/09/08 00:52:22 INFO worker.BspServiceWorker: startSuperstep: Master(hostname=localhost, MRtaskID=0, port=30000)
15/09/08 00:52:22 INFO worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/giraph_yarn_application_1441683854$
15/09/08 00:52:22 INFO yarn.GiraphYarnTask: [STATUS: task-1] startSuperstep: WORKER_ONLY - Attempt=0, Superstep=-1
15/09/08 00:52:22 INFO netty.NettyClient: Using Netty without authentication.
15/09/08 00:52:22 INFO netty.NettyClient: Using Netty without authentication.
15/09/08 00:52:22 INFO netty.NettyClient: connectAllAddresses: Successfully added 2 connections, (2 total connected) 0 failed, 0 failures total.
15/09/08 00:52:22 INFO netty.NettyServer: start: Using Netty without authentication.
15/09/08 00:52:22 INFO handler.RequestDecoder: decode: Server window metrics MBytes/sec received = 0, MBytesReceived = 0.0001, ave received req MBytes = 0.0001, secs waited = 1.44168435E9
15/09/08 00:52:22 INFO worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 1 total splits.
15/09/08 00:52:22 INFO worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0, overall roughly 0.0% input splits rese$
15/09/08 00:52:22 INFO worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://hdnode01:54310/u$
15/09/08 00:52:22 INFO worker.InputSplitsCallable: loadFromInputSplit: Finished loading /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0 (v=6, e=10)
15/09/08 00:52:22 INFO worker.InputSplitsCallable: call: Loaded 1 input splits in 0.16241108 secs, (v=6, e=10) 36.94329 vertices/sec, 61.572155 edges/sec
15/09/08 00:52:22 ERROR utils.LogStacktraceCallable: Execution of callable failed
java.lang.IllegalStateException: next: IOException
at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101)
at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1
at org.apache.giraph.utils.UnsafeReads.ensureRemaining(UnsafeReads.java:77)
at org.apache.giraph.utils.UnsafeArrayReads.readByte(UnsafeArrayReads.java:123)
at org.apache.giraph.utils.UnsafeReads.readLine(UnsafeReads.java:100)
at pruebas.TextAndDoubleComplexWritable.readFields(TextAndDoubleComplexWritable.java:37)
at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:540)
at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
... 11 more
15/09/08 00:52:22 ERROR worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHea$
15/09/08 00:52:22 ERROR yarn.GiraphYarnTask: GiraphYarnTask threw a top-level exception, failing task
java.lang.RuntimeException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$ [email protected]
at org.apache.giraph.yarn.GiraphYarnTask.run(GiraphYarnTask.java:104)
at org.apache.giraph.yarn.GiraphYarnTask.main(GiraphYarnTask.java:183)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for [email protected]0
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
at org.apache.giraph.yarn.GiraphYarnTask.run(GiraphYarnTask.java:92)
... 1 more
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: next: IOException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:202)
at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
... 10 more
Caused by: java.lang.IllegalStateException: next: IOException
at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101)
at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1
at org.apache.giraph.utils.UnsafeReads.ensureRemaining(UnsafeReads.java:77)
at org.apache.giraph.utils.UnsafeArrayReads.readByte(UnsafeArrayReads.java:123)
at org.apache.giraph.utils.UnsafeReads.readLine(UnsafeReads.java:100)
at pruebas.TextAndDoubleComplexWritable.readFields(TextAndDoubleComplexWritable.java:37)
at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:540)
at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
... 11 more
My input format:
package pruebas;

import org.apache.giraph.edge.Edge;
import org.apache.giraph.edge.EdgeFactory;
import org.apache.giraph.io.formats.AdjacencyListTextVertexInputFormat;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/**
 * @author hduser
 */
public class IdTextWithComplexValueInputFormat extends
        AdjacencyListTextVertexInputFormat<Text, TextAndDoubleComplexWritable, DoubleWritable> {

    @Override
    public AdjacencyListTextVertexReader createVertexReader(InputSplit split,
            TaskAttemptContext context) {
        return new TextComplexValueDoubleAdjacencyListVertexReader();
    }

    protected class TextComplexValueDoubleAdjacencyListVertexReader extends
            AdjacencyListTextVertexReader {

        /** Default constructor. */
        public TextComplexValueDoubleAdjacencyListVertexReader() {
            super();
        }

        @Override
        public Text decodeId(String s) {
            return new Text(s);
        }

        @Override
        public TextAndDoubleComplexWritable decodeValue(String s) {
            TextAndDoubleComplexWritable valorComplejo = new TextAndDoubleComplexWritable();
            valorComplejo.setVertexData(Double.valueOf(s));
            valorComplejo.setIds_vertices_anteriores("");
            return valorComplejo;
        }

        @Override
        public Edge<Text, DoubleWritable> decodeEdge(String s1, String s2) {
            return EdgeFactory.create(new Text(s1),
                    new DoubleWritable(Double.valueOf(s2)));
        }
    }
}
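For reference, this is roughly how a single adjacency-list line (like the ones in the input file shown below) is expected to map onto the decode methods above. It is only a hypothetical walk-through: the class name LineMappingSketch and the manual split on spaces are mine, and the real reader tokenizes each line with the format's configured delimiter rather than this hard-coded split.

package pruebas;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;

// Hypothetical walk-through of one line, e.g. "Portada 0.0 Sugerencias 1.0":
//   token 0        -> decodeId    -> vertex id (Text)
//   token 1        -> decodeValue -> vertex value (TextAndDoubleComplexWritable)
//   tokens 2 and 3 -> decodeEdge  -> one outgoing edge (target id, edge weight)
public class LineMappingSketch {
    public static void main(String[] args) {
        String[] tokens = "Portada 0.0 Sugerencias 1.0".split(" ");

        Text id = new Text(tokens[0]);
        TextAndDoubleComplexWritable value = new TextAndDoubleComplexWritable();
        value.setVertexData(Double.valueOf(tokens[1]));
        Text targetId = new Text(tokens[2]);
        DoubleWritable edgeWeight = new DoubleWritable(Double.valueOf(tokens[3]));

        System.out.println(id + " -> " + targetId + " (" + edgeWeight
                + "), vertex value = " + value.getVertexData());
    }
}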
TextAndDoubleComplexWritable:
package pruebas;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class TextAndDoubleComplexWritable implements Writable {

    private String idsVerticesAnteriores;
    private double vertexData;

    public TextAndDoubleComplexWritable() {
        super();
        this.idsVerticesAnteriores = "";
    }

    public TextAndDoubleComplexWritable(double vertexData) {
        super();
        this.vertexData = vertexData;
    }

    public TextAndDoubleComplexWritable(String ids_vertices_anteriores,
            double vertexData) {
        super();
        this.idsVerticesAnteriores = ids_vertices_anteriores;
        this.vertexData = vertexData;
    }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(idsVerticesAnteriores);
    }

    public void readFields(DataInput in) throws IOException {
        idsVerticesAnteriores = in.readLine();
    }

    public String getIds_vertices_anteriores() {
        return idsVerticesAnteriores;
    }

    public void setIds_vertices_anteriores(String ids_vertices_anteriores) {
        this.idsVerticesAnteriores = ids_vertices_anteriores;
    }

    public double getVertexData() {
        return vertexData;
    }

    public void setVertexData(double vertexData) {
        this.vertexData = vertexData;
    }
}
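Note the asymmetry in this class (it turns out to matter, see the Solution at the end): write() serializes only idsVerticesAnteriores with writeUTF() and never writes vertexData, while readFields() reads with readLine(). The mismatch can be reproduced with plain java.io streams, independent of Giraph. This is only a sketch I wrote to illustrate it; the class SymmetryCheck is not part of the job.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SymmetryCheck {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);

        // Mirrors TextAndDoubleComplexWritable.write() for two vertices:
        // each writeUTF() emits a 2-byte length prefix plus the UTF-8 bytes,
        // but no line terminator, and vertexData is never written at all.
        out.writeUTF("ids of vertex 1");
        out.writeUTF("ids of vertex 2");

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()));

        // readLine() (deprecated) scans for a line terminator that writeUTF()
        // never wrote, so it runs through the length prefixes and the second
        // record and only stops at the end of the buffer.
        System.out.println("readLine() returned: " + in.readLine());
        System.out.println("bytes left: " + in.available()); // 0 - nothing remains

        // In Giraph, UnsafeReads.readLine() does the same over the buffer that
        // holds all vertices of the request; the following readByte() then fails
        // with "ensureRemaining: Only 0 bytes remaining, trying to read 1".
    }
}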
My input file:
Portada 0.0 Sugerencias 1.0
Sugerencias 3.0 Portada 1.0
Then I execute it with this command:
$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithComplexValueInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250
Any help would be appreciated!
Update:
My input file was wrong: it contained an edge pointing to a vertex that was not listed in the file, and Giraph (or my example) did not handle that well. The problem still arises with a corrected file, though. I have updated the input data in the original question.
Update 2:
The OutputFormat is never used and the computation is never executed (the job fails while loading the input), so I removed both to keep the question focused.
Update 3, November 19, 2015:
The problem is not in the input format, which works well and reads the data in its entirety.
The problem is in the TextAndDoubleComplexWritable class, which I have added to the original question above to better explain the final solution (I also posted an answer).
Solution
The root cause of the exception is org.apache.giraph.utils.UnsafeReads.ensureRemaining, which is called from Giraph's internal deserialization code. The exception means that readFields() is asking for more bytes than remain in the serialized input, i.e. it has effectively reached EOF. Here the reason is that write() and readFields() are not symmetric: write() emits only idsVerticesAnteriores with writeUTF() (vertexData is never written), while readFields() reads with readLine(), which looks for a line terminator that was never written and runs past the end of the vertex's data.
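A minimal sketch of a symmetric implementation, assuming both fields are meant to be serialized (the class name TextAndDoubleComplexWritableFixed is only for illustration; in practice the fix goes into TextAndDoubleComplexWritable itself):

package pruebas;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

// write() and readFields() emit and consume exactly the same fields,
// in the same order, using matching DataOutput/DataInput methods.
public class TextAndDoubleComplexWritableFixed implements Writable {

    private String idsVerticesAnteriores = "";
    private double vertexData;

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(idsVerticesAnteriores); // length-prefixed string
        out.writeDouble(vertexData);         // 8 bytes
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        idsVerticesAnteriores = in.readUTF(); // counterpart of writeUTF()
        vertexData = in.readDouble();         // counterpart of writeDouble()
    }

    // getters and setters omitted for brevity
}

One caveat: writeUTF()/readUTF() are limited to 65535 encoded bytes, so if idsVerticesAnteriores can grow beyond that (for example, when accumulating long paths), a Text field or an explicit length plus raw bytes would be the safer choice.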