Phoenix does not display negative integer values correctly
I’m creating an HBASE table with a value of -17678. But when I got it back from pheonix, it gave me different positive values. RowKey is a composite rowkey, and there is no problem with rowkey.
Hbase insert:
public class test
{
public static void main(String args[])
{
Configuration config = HBaseConfiguration.create();
Connection connection = ConnectionFactory.createConnection(config);
Table table = connection.getTable(TableName.valueOf("TEST"));
Integer i=-17678;
try
{
Put p = new Put(Bytes.toBytes("rowkey"));
p.addColumn(Bytes.toBytes("test"),Bytes.toBytes("test"),Bytes.toBytes(i));
table.put(p);
}
finally
{
table.close();
connection.close();
}
}
}
Phoenix Retrieval:
Select CAST(“Value”AS INTEGER) from the test;
+------------------------------------------+
| TO_INTEGER(test." Value") |
+------------------------------------------+
| 2147465970 |
+------------------------------------------+
Is there something wrong here? Or the phoenix problem?
Solution
http://phoenix.apache.org/language/datatypes.html
The binary representation is a 4 byte integer with the sign bit flipped (so that negative values sorts before positive values).
So convert from HBase serialization format to Phoenix format:
(-17678)10 = (11111111111111111011101011110010)2
=> (01111111111111111011101011110010)2 = (2147465970)10
Therefore, the output is as expected. When inserting data using HBase, you need to be mindful of the binary representation.
Only the CHAR and UNSIGNED_* data types can read HBase toByte directly to Phoenix. You must be suitable for local serialization for other data types. IE。 Set i = 2147465970
when you want to insert -17678
.
I recommend using Phoenix to insert data. If you’re concerned about making your application dependent on dependencies, Phoenix provides a “thin” JDBC driver (4MB instead of 86MB).
https://phoenix.apache.org/server.html
If you absolutely must use HBase, you can use bitwise XOR to serialize signed numbers.
For integers, you may want to use a bitmask to XOR your i
to flip the sign bits.
The bitmask applied to a 4-byte integer is:
(10000000000000000000000000000000)2 = (-2147483648)10
From http://ideone.com/anhgs5 we get 2147465970
. If you insert it using HBase, when you read with Phoenix, you will read -17678).
You need to use different bitmasks for Bigint (which shares a bitmask with a datetime type), Smallint, Float, and Double.