I have one small table (about 100 MB) in Bigtable with 10 instances. When I scan/get a row once every minute, the latency of the call is more than 300 ms. If I hit it with more frequent calls, such as one every second, the latency is 50-60 ms. I am not sure how I can improve the performance of low-frequency calls. Is this expected behavior, or am I doing something wrong?
Here is my test code. I created a single executor shared by two HBase client connections to Bigtable, but the low-frequency connection responds much more slowly than the connection that makes more frequent calls.
Any suggestions?
package com.bids;

import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.util.Bytes;
import org.fusesource.jansi.AnsiConsole;

public class BTConnectTest {
    public static void main(String[] args) throws IOException, InterruptedException {
        Configuration hBaseConfig = HBaseConfiguration.create();
        hBaseConfig.set("google.bigtable.project.id", "xxxxxxx");
        hBaseConfig.set("google.bigtable.cluster.name", "hbase-test1");
        hBaseConfig.set("google.bigtable.zone.name", "us-central1-b");
        hBaseConfig.set("hbase.client.connection.impl", "com.google.cloud.bigtable.hbase1_1.BigtableConnection");
        ExecutorService executor = Executors.newSingleThreadExecutor();
        final Connection bigTableConnection1 = ConnectionFactory.createConnection(hBaseConfig, executor);
        final Connection bigTableConnection2 = ConnectionFactory.createConnection(hBaseConfig, executor);
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e1) {
                        // TODO Auto-generated catch block
                        e1.printStackTrace();
                    }
                    long before = System.nanoTime();
                    try {
                        makeACall2Bigtable(bigTableConnection2);
                    } catch (Exception e) {
                        // TODO Auto-generated catch block
                        e.printStackTrace();
                    }
                    // bigTableConnection.close();
                    long after = System.nanoTime();
                    long diff = after - before;
                    System.out.println("\t\t\t\t\t\t connection: " + 1 + " diff: " + diff / (1000 * 1000));
                }
            }
        });
        t.start();
        long sum = 0;
        int n = 0;
        while (true) {
            if (n > 60) {
                Thread.sleep(60000);
            }
            long before = System.nanoTime();
            Connection bigTableConnection = bigTableConnection1;
            int label = -1;
            makeACall2Bigtable(bigTableConnection);
            long after = System.nanoTime();
            long diff = after - before;
            n = n + 1;
            sum += diff;
            long avg = sum / (n * 1000 * 1000);
            AnsiConsole a = new AnsiConsole();
            System.out.println("connection: " + 0 + " diff: " + diff / (1000 * 1000) + " avg: " + avg);
        }
        // bigTableConnection.close();
    }

    private static void makeACall2Bigtable(Connection bigTableConnection) throws IOException {
        Table table = bigTableConnection.getTable(TableName.valueOf("customer"));
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("101"));
        scan.setStopRow(Bytes.toBytes("102"));
        List<String> cols = new ArrayList<String>(3);
        cols.add("name");
        cols.add("age");
        cols.add("weight");
        String keyName = "id";
        final String DEFAULT_COLFAM = "z";
        for (String col : cols) {
            scan.addColumn(Bytes.toBytes(DEFAULT_COLFAM), Bytes.toBytes(col));
        }
        ResultScanner resultScanner = table.getScanner(scan);
        for (Result result : resultScanner) {
            Map<String, String> columnValueMap = new LinkedHashMap<String, String>();
            for (String col : cols) {
                if (result.containsColumn(Bytes.toBytes(DEFAULT_COLFAM), Bytes.toBytes(col))) {
                    columnValueMap.put(col, new String(CellUtil.cloneValue(
                            result.getColumnLatestCell(Bytes.toBytes(DEFAULT_COLFAM), Bytes.toBytes(col)))));
                } else {
                    if (cols.contains(keyName)) {
                        columnValueMap.put(col, null);
                    }
                }
            }
        }
        resultScanner.close();
        table.close();
    }
}
I hope that helps.
If you really are going to make low-frequency requests in production, you might want to run a background thread that makes a random request to your table every few seconds.
Bigtable is really optimized for a large amount of data with frequent access. The first request in a while may call for the data to be read in again. Periodic requests will keep it live.
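The background-thread idea could look roughly like the sketch below. It is only an illustration: `KeepWarm` is a made-up helper name, not part of the HBase or Bigtable client, and the warm-up request is passed in as a plain `Runnable` so the sketch does not depend on any particular client call. The period you choose is also an assumption to tune for your workload.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a cheap "warm-up" request at a fixed delay on a daemon thread. */
final class KeepWarm implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    private KeepWarm(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    /** Starts the background task; the first warm-up request runs immediately. */
    static KeepWarm start(Runnable warmUpRequest, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "keep-warm");
            t.setDaemon(true); // do not keep the JVM alive just for warm-up calls
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                warmUpRequest.run();
            } catch (RuntimeException e) {
                // Swallow and report: an exception escaping here would cancel all future runs.
                System.err.println("warm-up request failed: " + e);
            }
        }, 0, period, unit);
        return new KeepWarm(scheduler);
    }

    /** Stops the background task. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With the test code from the question, the `Runnable` could wrap a call such as `makeACall2Bigtable(bigTableConnection1)` (catching its `IOException`), started with something like `KeepWarm.start(request, 5, TimeUnit.SECONDS)`. Issuing the warm-up on the same connection that serves the low-frequency traffic is a guess at what matters here; measure before relying on it.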