diff --git a/.gitignore b/.gitignore index 8782d86f6..1c7a7e78f 100644 --- a/.gitignore +++ b/.gitignore @@ -6,4 +6,8 @@ __pycache__ .settings .classpath .DS_Store - +.diff.txt +.new-list.txt +.org-list.txt +SimpleBenchmarkApp/src/main/java/com/google/cloud/App.java +.flattened-pom.xml \ No newline at end of file diff --git a/.kokoro/dependencies.sh b/.kokoro/dependencies.sh index 586b78bb9..545820622 100755 --- a/.kokoro/dependencies.sh +++ b/.kokoro/dependencies.sh @@ -67,13 +67,13 @@ function completenessCheck() { msg "Generating dependency list using original pom..." # Excluding commons-codec,commons-logging from the comparison as a temp fix # Explanation and issue filed in maven-dependency-plugin: https://issues.apache.org/jira/browse/MDEP-737 - mvn dependency:list -f pom.xml -DexcludeArtifactIds=commons-codec,commons-logging -DincludeScope=runtime -Dsort=true | grep '\[INFO] .*:.*:.*:.*:.*' | sed -e s/\\s--\\smodule.*// >.org-list.txt + mvn dependency:list -f pom.xml -DexcludeArtifactIds=commons-codec,commons-logging,grpc-googleapis -DincludeScope=runtime -Dsort=true | grep '\[INFO] .*:.*:.*:.*:.*' | sed -e s/\\s--\\smodule.*// >.org-list.txt # Output dep list generated using the flattened pom (only 'compile' and 'runtime' scopes) msg "Generating dependency list using flattened pom..." # Excluding commons-codec,commons-logging from the comparison as a temp fix # Explanation and issue filed in maven-dependency-plugin: https://issues.apache.org/jira/browse/MDEP-737 - mvn dependency:list -f .flattened-pom.xml -DexcludeArtifactIds=commons-codec,commons-logging -DincludeScope=runtime -Dsort=true | grep '\[INFO] .*:.*:.*:.*:.*' >.new-list.txt + mvn dependency:list -f .flattened-pom.xml -DexcludeArtifactIds=commons-codec,commons-logging,grpc-googleapis -DincludeScope=runtime -Dsort=true | grep '\[INFO] .*:.*:.*:.*:.*' >.new-list.txt # Compare two dependency lists msg "Comparing dependency lists..." diff --git a/benchmark/README.md b/benchmark/README.md index 41e9c2fda..d1a1ae157 100644 --- a/benchmark/README.md +++ b/benchmark/README.md @@ -19,3 +19,10 @@ To run a benchmark jar, run the following command cd benchmark java -jar target/benchmark.jar ``` + +To run ConnImplBenchmark, run the following command +``` +# Run from benchmark directory + cd benchmark + java -jar target/benchmark.jar com.google.cloud.bigquery.ConnImplBenchmark +``` diff --git a/benchmark/src/main/java/com.google.cloud.bigquery/ConnImplBenchmark.java b/benchmark/src/main/java/com.google.cloud.bigquery/ConnImplBenchmark.java new file mode 100644 index 000000000..36c27eb6a --- /dev/null +++ b/benchmark/src/main/java/com.google.cloud.bigquery/ConnImplBenchmark.java @@ -0,0 +1,186 @@ +/* + * Copyright 2022 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package com.google.cloud.bigquery; + +import java.io.IOException; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.util.concurrent.TimeUnit; +import java.util.logging.Level; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.BenchmarkMode; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Mode; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Param; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.Warmup; +import org.openjdk.jmh.infra.Blackhole; +import org.openjdk.jmh.runner.Runner; +import org.openjdk.jmh.runner.options.Options; +import org.openjdk.jmh.runner.options.OptionsBuilder; + +@Fork(value = 1) +@BenchmarkMode(Mode.AverageTime) +@Warmup(iterations = 1) +@Measurement(iterations = 3) +@State(Scope.Benchmark) +@OutputTimeUnit(TimeUnit.MILLISECONDS) +public class ConnImplBenchmark { + @Param({"500000", "1000000", "10000000", "100000000"}) // 500K, 1M, 10M, and 100M + public int rowLimit; + + private ConnectionSettings connectionSettingsReadAPIEnabled, connectionSettingsReadAPIDisabled; + private long numBuffRows = 100000L; + private final String DATASET = "new_york_taxi_trips"; + private final String QUERY = + "SELECT * FROM bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2017 LIMIT %s"; + public static final long NUM_PAGE_ROW_CNT_RATIO = + 10; // ratio of [records in the current page :: total rows] to be met to use read API + public static final long NUM_MIN_RESULT_SIZE = + 200000; // min number of records to use to ReadAPI with + + @Setup + public void setUp() throws IOException { + java.util.logging.Logger.getGlobal().setLevel(Level.ALL); + ReadClientConnectionConfiguration clientConnectionConfiguration; + + clientConnectionConfiguration = + ReadClientConnectionConfiguration.newBuilder() + .setTotalToPageRowCountRatio(NUM_PAGE_ROW_CNT_RATIO) + .setMinResultSize(NUM_MIN_RESULT_SIZE) + .setBufferSize(numBuffRows) + .build(); + + connectionSettingsReadAPIEnabled = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setNumBufferedRows(numBuffRows) // page size + .setPriority( + QueryJobConfiguration.Priority + .INTERACTIVE) // DEFAULT VALUE - so that isFastQuerySupported returns false + .setReadClientConnectionConfiguration(clientConnectionConfiguration) + .setUseReadAPI(true) // enable read api + .build(); + connectionSettingsReadAPIDisabled = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setNumBufferedRows(numBuffRows) // page size + .setPriority( + QueryJobConfiguration.Priority + .INTERACTIVE) // so that isFastQuerySupported returns false + .setReadClientConnectionConfiguration(clientConnectionConfiguration) + .setUseReadAPI(false) // disable read api + .build(); + } + + @Benchmark + public void iterateRecordsUsingReadAPI(Blackhole blackhole) + throws InterruptedException, BigQuerySQLException { + Connection connectionReadAPIEnabled = + BigQueryOptions.getDefaultInstance() + .getService() + .createConnection(connectionSettingsReadAPIEnabled); + String selectQuery = String.format(QUERY, rowLimit); + long hash = 0L; + try { + BigQueryResultSet bigQueryResultSet = connectionReadAPIEnabled.executeSelect(selectQuery); + hash = getResultHash(bigQueryResultSet); + } catch (Exception e) { + e.printStackTrace(); + } finally { + 
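+      // Close inside finally so the Read API's background worker threads are shut down
+      // even if executeSelect or result iteration throws.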
connectionReadAPIEnabled.close(); // IMP to kill the bg workers + } + blackhole.consume(hash); + } + + @Benchmark + public void iterateRecordsWithoutUsingReadAPI(Blackhole blackhole) + throws InterruptedException, BigQuerySQLException { + Connection connectionReadAPIDisabled = + BigQueryOptions.getDefaultInstance() + .getService() + .createConnection(connectionSettingsReadAPIDisabled); + String selectQuery = String.format(QUERY, rowLimit); + long hash = 0L; + try { + BigQueryResultSet bigQueryResultSet = connectionReadAPIDisabled.executeSelect(selectQuery); + hash = getResultHash(bigQueryResultSet); + } catch (Exception e) { + e.printStackTrace(); + } finally { + connectionReadAPIDisabled.close(); // IMP to kill the bg workers + } + blackhole.consume(hash); + } + + // Hashes all the 20 columns of all the rows + private long getResultHash(BigQueryResultSet bigQueryResultSet) throws SQLException { + ResultSet rs = bigQueryResultSet.getResultSet(); + long hash = 0L; + int cnt = 0; + System.out.print("\n Running"); + while (rs.next()) { + hash += rs.getString("vendor_id") == null ? 0 : rs.getString("vendor_id").hashCode(); + hash += + rs.getString("pickup_datetime") == null ? 0 : rs.getString("pickup_datetime").hashCode(); + hash += + rs.getString("dropoff_datetime") == null + ? 0 + : rs.getString("dropoff_datetime").hashCode(); + hash += rs.getLong("passenger_count"); + hash += rs.getDouble("trip_distance"); + hash += rs.getDouble("pickup_longitude"); + hash += rs.getDouble("pickup_latitude"); + hash += rs.getString("rate_code") == null ? 0 : rs.getString("rate_code").hashCode(); + hash += + rs.getString("store_and_fwd_flag") == null + ? 0 + : rs.getString("store_and_fwd_flag").hashCode(); + hash += rs.getDouble("dropoff_longitude"); + hash += rs.getDouble("dropoff_latitude"); + hash += rs.getString("payment_type") == null ? 0 : rs.getString("payment_type").hashCode(); + hash += rs.getDouble("fare_amount"); + hash += rs.getDouble("extra"); + hash += rs.getDouble("mta_tax"); + hash += rs.getDouble("tip_amount"); + hash += rs.getDouble("tolls_amount"); + hash += rs.getDouble("imp_surcharge"); + hash += rs.getDouble("total_amount"); + hash += + rs.getString("pickup_location_id") == null + ? 0 + : rs.getString("pickup_location_id").hashCode(); + hash += + rs.getString("dropoff_location_id") == null + ? 
0 + : rs.getString("dropoff_location_id").hashCode(); + if (++cnt % 100000 == 0) { // just to indicate the progress while long running benchmarks + System.out.print("."); + } + } + return hash; + } + + public static void main(String[] args) throws Exception { + Options opt = new OptionsBuilder().include(ConnImplBenchmark.class.getSimpleName()).build(); + new Runner(opt).run(); + } +} diff --git a/google-cloud-bigquery/clirr-ignored-differences.xml b/google-cloud-bigquery/clirr-ignored-differences.xml new file mode 100644 index 000000000..1dccf5a66 --- /dev/null +++ b/google-cloud-bigquery/clirr-ignored-differences.xml @@ -0,0 +1,35 @@ + + + + + + 7012 + com/google/cloud/bigquery/BigQuery + com.google.cloud.bigquery.Connection createConnection(com.google.cloud.bigquery.ConnectionSettings) + + + 7012 + com/google/cloud/bigquery/BigQuery + com.google.cloud.bigquery.Connection createConnection() + + + 7012 + com/google/cloud/bigquery/spi/v2/BigQueryRpc + com.google.api.services.bigquery.model.Job createJobForQuery(com.google.api.services.bigquery.model.Job) + + + 7012 + com/google/cloud/bigquery/spi/v2/BigQueryRpc + com.google.api.services.bigquery.model.Job getQueryJob(java.lang.String, java.lang.String, java.lang.String) + + + 7012 + com/google/cloud/bigquery/spi/v2/BigQueryRpc + com.google.api.services.bigquery.model.GetQueryResultsResponse getQueryResultsWithRowLimit(java.lang.String, java.lang.String, java.lang.String, java.lang.Integer) + + + 7012 + com/google/cloud/bigquery/spi/v2/BigQueryRpc + com.google.api.services.bigquery.model.TableDataList listTableDataWithRowLimit(java.lang.String, java.lang.String, java.lang.String, java.lang.Integer, java.lang.String) + + \ No newline at end of file diff --git a/google-cloud-bigquery/pom.xml b/google-cloud-bigquery/pom.xml index 93d574784..c3abfe1cf 100644 --- a/google-cloud-bigquery/pom.xml +++ b/google-cloud-bigquery/pom.xml @@ -46,6 +46,10 @@ org.checkerframework checker-compat-qual + + org.checkerframework + checker-qual + com.google.auth google-auth-library-oauth2-http @@ -97,6 +101,21 @@ + + com.google.cloud + google-cloud-datacatalog + test + + + com.google.api.grpc + proto-google-cloud-datacatalog-v1 + test + + + com.google.cloud + google-cloud-storage + test + junit junit @@ -105,6 +124,7 @@ com.google.truth truth + test org.mockito diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/AbstractJdbcResultSet.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/AbstractJdbcResultSet.java new file mode 100644 index 000000000..5b8246925 --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/AbstractJdbcResultSet.java @@ -0,0 +1,910 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package com.google.cloud.bigquery; + +import java.io.InputStream; +import java.io.Reader; +import java.math.BigDecimal; +import java.net.URL; +import java.sql.*; +import java.util.Calendar; +import java.util.Map; + +abstract class AbstractJdbcResultSet implements ResultSet { + + @Override + public BigDecimal getBigDecimal(String columnLabel, int scale) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public void close() throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public boolean wasNull() throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public byte getByte(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public byte getByte(String column) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public short getShort(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public float getFloat(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public BigDecimal getBigDecimal(int columnIndex, int scale) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public InputStream getAsciiStream(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public InputStream getUnicodeStream(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public InputStream getBinaryStream(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public int findColumn(String columnLabel) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public Reader getCharacterStream(int columnIndex) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public Reader getCharacterStream(String columnLabel) throws SQLException { + // TODO: Implement the logic + throw new RuntimeException("Not implemented"); + } + + @Override + public SQLWarning getWarnings() throws SQLException { + return null; + } + + @Override + public void clearWarnings() throws SQLException {} + + @Override + public String getCursorName() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean isLast() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void beforeFirst() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void afterLast() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean first() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean last() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean absolute(int row) throws SQLException { + throw new 
SQLFeatureNotSupportedException(); + } + + @Override + public boolean relative(int rows) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean previous() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void setFetchDirection(int direction) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public int getFetchDirection() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void setFetchSize(int rows) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public int getFetchSize() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public int getType() throws SQLException { + return TYPE_FORWARD_ONLY; + } + + @Override + public int getConcurrency() throws SQLException { + return CONCUR_READ_ONLY; + } + + @Override + public boolean rowUpdated() throws SQLException { + return false; + } + + @Override + public boolean rowInserted() throws SQLException { + return false; + } + + @Override + public boolean rowDeleted() throws SQLException { + return false; + } + + @Override + public void updateNull(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBoolean(int columnIndex, boolean x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateByte(int columnIndex, byte x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateShort(int columnIndex, short x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateInt(int columnIndex, int x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateLong(int columnIndex, long x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateFloat(int columnIndex, float x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateDouble(int columnIndex, double x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBigDecimal(int columnIndex, BigDecimal x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateString(int columnIndex, String x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBytes(int columnIndex, byte[] x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateDate(int columnIndex, Date x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateTime(int columnIndex, Time x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateTimestamp(int columnIndex, Timestamp x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateAsciiStream(int columnIndex, InputStream x, int length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBinaryStream(int columnIndex, InputStream x, int length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateCharacterStream(int 
columnIndex, Reader x, int length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(int columnIndex, Object x, int scaleOrLength) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(int columnIndex, Object x, SQLType type, int scaleOrLength) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(int columnIndex, Object x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(int columnIndex, Object x, SQLType type) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNull(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBoolean(String columnLabel, boolean x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateByte(String columnLabel, byte x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateShort(String columnLabel, short x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateInt(String columnLabel, int x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateLong(String columnLabel, long x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateFloat(String columnLabel, float x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateDouble(String columnLabel, double x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBigDecimal(String columnLabel, BigDecimal x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateString(String columnLabel, String x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBytes(String columnLabel, byte[] x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateDate(String columnLabel, Date x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateTime(String columnLabel, Time x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateTimestamp(String columnLabel, Timestamp x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateAsciiStream(String columnLabel, InputStream x, int length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBinaryStream(String columnLabel, InputStream x, int length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateCharacterStream(String columnLabel, Reader reader, int length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(String columnLabel, Object x, int scaleOrLength) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(String columnLabel, Object x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + 
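+  // Like every update/insert/delete override in this class, mutation is unsupported:
+  // this result set is TYPE_FORWARD_ONLY and CONCUR_READ_ONLY (see getType() and
+  // getConcurrency() above).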
@Override + public void updateObject(String columnLabel, Object x, SQLType type) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateObject(String columnLabel, Object x, SQLType type, int scaleOrLength) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void insertRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void deleteRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void refreshRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void cancelRowUpdates() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void moveToInsertRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void moveToCurrentRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Ref getRef(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Ref getRef(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateRef(int columnIndex, Ref x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateRef(String columnLabel, Ref x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBlob(int columnIndex, Blob x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBlob(String columnLabel, Blob x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateClob(int columnIndex, Clob x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateClob(String columnLabel, Clob x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateArray(int columnIndex, Array x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateArray(String columnLabel, Array x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public RowId getRowId(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public RowId getRowId(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateRowId(int columnIndex, RowId x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateRowId(String columnLabel, RowId x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNString(int columnIndex, String nString) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNString(String columnLabel, String nString) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNClob(int columnIndex, NClob nClob) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNClob(String columnLabel, NClob 
nClob) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public SQLXML getSQLXML(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public SQLXML getSQLXML(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateSQLXML(int columnIndex, SQLXML xmlObject) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateSQLXML(String columnLabel, SQLXML xmlObject) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNCharacterStream(int columnIndex, Reader x, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNCharacterStream(String columnLabel, Reader reader, long length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateAsciiStream(int columnIndex, InputStream x, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBinaryStream(int columnIndex, InputStream x, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateCharacterStream(int columnIndex, Reader x, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateAsciiStream(String columnLabel, InputStream x, long length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBinaryStream(String columnLabel, InputStream x, long length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateCharacterStream(String columnLabel, Reader reader, long length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBlob(int columnIndex, InputStream inputStream, long length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBlob(String columnLabel, InputStream inputStream, long length) + throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateClob(int columnIndex, Reader reader, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateClob(String columnLabel, Reader reader, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNClob(int columnIndex, Reader reader, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNClob(String columnLabel, Reader reader, long length) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNCharacterStream(int columnIndex, Reader x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNCharacterStream(String columnLabel, Reader reader) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateAsciiStream(int columnIndex, InputStream x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBinaryStream(int columnIndex, InputStream x) throws SQLException { + throw new 
SQLFeatureNotSupportedException(); + } + + @Override + public void updateCharacterStream(int columnIndex, Reader x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateAsciiStream(String columnLabel, InputStream x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBinaryStream(String columnLabel, InputStream x) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateCharacterStream(String columnLabel, Reader reader) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBlob(int columnIndex, InputStream inputStream) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateBlob(String columnLabel, InputStream inputStream) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateClob(int columnIndex, Reader reader) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateClob(String columnLabel, Reader reader) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNClob(int columnIndex, Reader reader) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public void updateNClob(String columnLabel, Reader reader) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean isBeforeFirst() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean isAfterLast() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean isFirst() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public int getRow() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Statement getStatement() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Object getObject(int columnIndex, Map> map) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Blob getBlob(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Clob getClob(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Array getArray(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public float getFloat(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Object getObject(String columnLabel, Map> map) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Blob getBlob(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Clob getClob(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Array getArray(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Date getDate(int columnIndex, Calendar cal) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Date getDate(String columnLabel, Calendar cal) throws SQLException { + 
throw new SQLFeatureNotSupportedException(); + } + + @Override + public Time getTime(int columnIndex, Calendar cal) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Time getTime(String columnLabel, Calendar cal) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Timestamp getTimestamp(int columnIndex, Calendar cal) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Timestamp getTimestamp(String columnLabel, Calendar cal) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public URL getURL(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public URL getURL(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public int getHoldability() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean isClosed() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public NClob getNClob(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public NClob getNClob(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public String getNString(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public String getNString(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Reader getNCharacterStream(int columnIndex) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public Reader getNCharacterStream(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public T getObject(int columnIndex, Class type) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public T getObject(String columnLabel, Class type) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public short getShort(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public InputStream getAsciiStream(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public InputStream getUnicodeStream(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public InputStream getBinaryStream(String columnLabel) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public ResultSetMetaData getMetaData() throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public T unwrap(Class iface) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } + + @Override + public boolean isWrapperFor(Class iface) throws SQLException { + throw new SQLFeatureNotSupportedException(); + } +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuery.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuery.java index 4e88f000f..b574df32d 100644 --- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuery.java +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuery.java @@ -18,6 +18,7 @@ import static 
com.google.common.base.Preconditions.checkArgument; +import com.google.api.core.BetaApi; import com.google.api.core.InternalApi; import com.google.api.gax.paging.Page; import com.google.cloud.FieldSelector; @@ -32,6 +33,7 @@ import java.io.Serializable; import java.util.ArrayList; import java.util.List; +import org.checkerframework.checker.nullness.qual.NonNull; /** * An interface for Google Cloud BigQuery. @@ -760,6 +762,53 @@ public int hashCode() { */ Job create(JobInfo jobInfo, JobOption... options); + /** + * Creates a new BigQuery query connection used for executing queries (not the same as BigQuery + * connection properties). It uses the BigQuery Storage Read API for high throughput queries by + * default. + * + *

<p>Example of creating a query connection.
+   *
+   * <pre>{@code
+   * ConnectionSettings connectionSettings =
+   *     ConnectionSettings.newBuilder()
+   *         .setRequestTimeout(10L)
+   *         .setMaxResults(100L)
+   *         .setUseQueryCache(true)
+   *         .build();
+   * Connection connection = bigquery.createConnection(connectionSettings);
+   * }</pre>
+ * + * @param connectionSettings the settings to use when executing the query + * @throws BigQueryException upon failure + */ + @BetaApi + Connection createConnection(@NonNull ConnectionSettings connectionSettings); + + /** + * Creates a new BigQuery query connection used for executing queries (not the same as BigQuery + * connection properties). It uses the BigQuery Storage Read API for high throughput queries by + * default. This overload creates a Connection with default ConnectionSettings for query + * execution: numBufferedRows defaults to 20000, useReadApi to true, and useLegacySql to false. + * + *

<p>Example of creating a query connection.
+   *
+   * <pre>{@code
+   * Connection connection = bigquery.createConnection();
+   * }</pre>
+ * + * @throws BigQueryException upon failure + */ + @BetaApi + Connection createConnection(); + /** * Returns the requested dataset or {@code null} if not found. * diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryDryRunResult.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryDryRunResult.java new file mode 100644 index 000000000..0494aa1a9 --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryDryRunResult.java @@ -0,0 +1,39 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.google.cloud.bigquery; + +import com.google.api.core.BetaApi; +import java.util.List; + +public interface BigQueryDryRunResult { + + /** Returns the schema of the results. Null if the schema is not supplied. */ + @BetaApi + Schema getSchema() throws BigQuerySQLException; + + /** + * Returns query parameters for standard SQL queries by extracting undeclared query parameters from + * the dry run job. For more information, see: + * https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/java/latest/com/google/api/services/bigquery/model/JobStatistics2.html#getUndeclaredQueryParameters-- + */ + @BetaApi + List getQueryParameters() throws BigQuerySQLException; + + /** Returns some processing statistics. */ + @BetaApi + BigQueryResultStats getStatistics() throws BigQuerySQLException; +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryDryRunResultImpl.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryDryRunResultImpl.java new file mode 100644 index 000000000..fabb2f2fc --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryDryRunResultImpl.java @@ -0,0 +1,49 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ + +package com.google.cloud.bigquery; + +import java.util.List; + +public class BigQueryDryRunResultImpl implements BigQueryDryRunResult { + private Schema schema; + private List queryParameters; + private BigQueryResultStats stats; + + BigQueryDryRunResultImpl( + Schema schema, + List queryParameters, + BigQueryResultStats stats) { // Package-Private access + this.schema = schema; + this.queryParameters = queryParameters; + this.stats = stats; + } + + @Override + public Schema getSchema() throws BigQuerySQLException { + return schema; + } + + @Override + public List getQueryParameters() throws BigQuerySQLException { + return queryParameters; + } + + @Override + public BigQueryResultStats getStatistics() throws BigQuerySQLException { + return stats; + } +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryException.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryException.java index 06cbf344c..c42ff6327 100644 --- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryException.java +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryException.java @@ -134,4 +134,13 @@ static BaseServiceException translateAndThrow(ExecutionException ex) { BaseServiceException.translate(ex); throw new BigQueryException(UNKNOWN_CODE, ex.getMessage(), ex.getCause()); } + + static BaseServiceException translateAndThrow(Exception ex) { + throw new BigQueryException(UNKNOWN_CODE, ex.getMessage(), ex.getCause()); + } + + static BaseServiceException translateAndThrowBigQuerySQLException(BigQueryException e) + throws BigQuerySQLException { + throw new BigQuerySQLException(e.getMessage(), e, e.getErrors()); + } } diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java index b2e939df0..3cfbfd652 100644 --- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java @@ -21,6 +21,7 @@ import static com.google.common.base.Preconditions.checkArgument; import static java.net.HttpURLConnection.HTTP_NOT_FOUND; +import com.google.api.core.BetaApi; import com.google.api.core.InternalApi; import com.google.api.gax.paging.Page; import com.google.api.services.bigquery.model.ErrorProto; @@ -54,6 +55,7 @@ import java.util.List; import java.util.Map; import java.util.concurrent.Callable; +import org.checkerframework.checker.nullness.qual.NonNull; final class BigQueryImpl extends BaseService implements BigQuery { @@ -351,6 +353,21 @@ public JobId get() { return create(jobInfo, idProvider, options); } + @Override + @BetaApi + public Connection createConnection(@NonNull ConnectionSettings connectionSettings) + throws BigQueryException { + return new ConnectionImpl(connectionSettings, getOptions(), bigQueryRpc, DEFAULT_RETRY_CONFIG); + } + + @Override + @BetaApi + public Connection createConnection() throws BigQueryException { + ConnectionSettings defaultConnectionSettings = ConnectionSettings.newBuilder().build(); + return new ConnectionImpl( + defaultConnectionSettings, getOptions(), bigQueryRpc, DEFAULT_RETRY_CONFIG); + } + @InternalApi("visible for testing") Job create(JobInfo jobInfo, Supplier idProvider, JobOption... 
options) { final boolean idRandom = (jobInfo.getJobId() == null); diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResult.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResult.java new file mode 100644 index 000000000..6b0c35f67 --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResult.java @@ -0,0 +1,38 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.google.cloud.bigquery; + +import java.sql.ResultSet; + +public interface BigQueryResult { + + /** Returns the schema of the results. */ + Schema getSchema(); + + /** + * Returns the total number of rows in the complete result set, which can be more than the number + * of rows in the first page of results. This might return -1 if the query is long running and the + * job is not complete at the time this object is returned. + */ + long getTotalRows(); + + /* Returns the underlying ResultSet Implementation */ + ResultSet getResultSet(); + + /* Returns the query statistics associated with this query. */ + BigQueryResultStats getBigQueryResultStats(); +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultImpl.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultImpl.java new file mode 100644 index 000000000..7c24ca0dd --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultImpl.java @@ -0,0 +1,610 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.google.cloud.bigquery; + +import java.math.BigDecimal; +import java.sql.Date; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Time; +import java.sql.Timestamp; +import java.time.LocalDateTime; +import java.time.LocalTime; +import java.time.ZoneId; +import java.util.Map; +import java.util.TimeZone; +import java.util.concurrent.BlockingQueue; +import java.util.concurrent.TimeUnit; +import org.apache.arrow.vector.util.JsonStringArrayList; +import org.apache.arrow.vector.util.Text; + +public class BigQueryResultImpl implements BigQueryResult { + + private static final String NULL_CURSOR_MSG = + "Error occurred while reading the cursor. 
This could happen if getters are called after we are done reading all the records"; + + // This class represents a row of records, the columns are represented as a map + // (columnName:columnValue pair) + static class Row { + private Map value; + private boolean isLast; + + public Row(Map value) { + this.value = value; + } + + public Row(Map value, boolean isLast) { + this.value = value; + this.isLast = isLast; + } + + public Map getValue() { + return value; + } + + public boolean isLast() { + return isLast; + } + + public boolean hasField(String fieldName) { + return this.value.containsKey(fieldName); + } + + public Object get(String fieldName) { + return this.value.get(fieldName); + } + } + + private final Schema schema; + private final long totalRows; + private final BlockingQueue buffer; + private T cursor; + private final BigQueryResultSet underlyingResultSet; + private final BigQueryResultStats bigQueryResultStats; + private final FieldList schemaFieldList; + + public BigQueryResultImpl( + Schema schema, + long totalRows, + BlockingQueue buffer, + BigQueryResultStats bigQueryResultStats) { + this.schema = schema; + this.totalRows = totalRows; + this.buffer = buffer; + this.underlyingResultSet = new BigQueryResultSet(); + this.bigQueryResultStats = bigQueryResultStats; + this.schemaFieldList = schema.getFields(); + } + + @Override + public Schema getSchema() { + return schema; + } + + @Override + public long getTotalRows() { + return totalRows; + } + + @Override + public ResultSet getResultSet() { + return underlyingResultSet; + } + + private class BigQueryResultSet extends AbstractJdbcResultSet { + @Override + /*Advances the result set to the next row, returning false if no such row exists. Potentially blocking operation*/ + public boolean next() throws SQLException { + try { + cursor = buffer.take(); // advance the cursor,Potentially blocking operation + if (isEndOfStream(cursor)) { // check for end of stream + cursor = null; + return false; + } else if (cursor instanceof Row) { + Row curTup = (Row) cursor; + if (curTup.isLast()) { // last Tuple + cursor = null; + return false; + } + return true; + } else if (cursor instanceof FieldValueList) { // cursor is advanced, we can return true now + return true; + } else { // this case should never occur as the cursor will either be a Row of EoS + throw new BigQuerySQLException("Could not process the current row"); + } + } catch (InterruptedException e) { + throw new SQLException( + "Error occurred while advancing the cursor. This could happen when connection is closed while we call the next method"); + } + } + + private boolean isEndOfStream(T cursor) { + return cursor instanceof ConnectionImpl.EndOfFieldValueList; + } + + @Override + public Object getObject(String fieldName) throws SQLException { + if (fieldName == null) { + throw new SQLException("fieldName can't be null"); + } + if (cursor == null) { + throw new BigQuerySQLException(NULL_CURSOR_MSG); + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName); + return (fieldValue == null || fieldValue.getValue() == null) ? 
null : fieldValue.getValue(); + } else { // Data received from Read API (Arrow) + Row curRow = (Row) cursor; + if (!curRow.hasField(fieldName)) { + throw new SQLException(String.format("Field %s not found", fieldName)); + } + return curRow.get(fieldName); + } + } + + @Override + public Object getObject(int columnIndex) throws SQLException { + if (cursor == null) { + return null; + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex); + return (fieldValue == null || fieldValue.getValue() == null) ? null : fieldValue.getValue(); + } else { // Data received from Read API (Arrow) + return getObject(schemaFieldList.get(columnIndex).getName()); + } + } + + @Override + public String getString(String fieldName) throws SQLException { + if (fieldName == null) { + throw new SQLException("fieldName can't be null"); + } + if (cursor == null) { + throw new BigQuerySQLException(NULL_CURSOR_MSG); + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName); + if ((fieldValue == null || fieldValue.getValue() == null)) { + return null; + } else if (fieldValue + .getAttribute() + .equals(FieldValue.Attribute.REPEATED)) { // Case for Arrays + return fieldValue.getValue().toString(); + } else { + return fieldValue.getStringValue(); + } + } else { // Data received from Read API (Arrow) + Row curRow = (Row) cursor; + if (!curRow.hasField(fieldName)) { + throw new SQLException(String.format("Field %s not found", fieldName)); + } + Object currentVal = curRow.get(fieldName); + if (currentVal == null) { + return null; + } else if (currentVal instanceof JsonStringArrayList) { // arrays + JsonStringArrayList jsnAry = (JsonStringArrayList) currentVal; + return jsnAry.toString(); + } else if (currentVal instanceof LocalDateTime) { + LocalDateTime dateTime = (LocalDateTime) currentVal; + return dateTime.toString(); + } else { + Text textVal = (Text) currentVal; + return textVal.toString(); + } + } + } + + @Override + public String getString(int columnIndex) throws SQLException { + if (cursor == null) { + return null; + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex); + return (fieldValue == null || fieldValue.getValue() == null) + ? null + : fieldValue.getStringValue(); + } else { // Data received from Read API (Arrow) + return getString(schemaFieldList.get(columnIndex).getName()); + } + } + + @Override + public int getInt(String fieldName) throws SQLException { + if (fieldName == null) { + throw new SQLException("fieldName can't be null"); + } + if (cursor == null) { + return 0; // the column value; if the value is SQL NULL, the value returned is 0 as per + // java.sql.ResultSet definition + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName); + return (fieldValue == null || fieldValue.getValue() == null) + ? 0 + : fieldValue.getNumericValue().intValue(); + } else { // Data received from Read API (Arrow) + + Row curRow = (Row) cursor; + if (!curRow.hasField(fieldName)) { + throw new SQLException(String.format("Field %s not found", fieldName)); + } + Object curVal = curRow.get(fieldName); + if (curVal == null) { + return 0; + } + if (curVal instanceof Text) { // parse from text to int + return Integer.parseInt(((Text) curVal).toString()); + } else if (curVal + instanceof + Long) { // incase getInt is called for a Long value. 
Loss of precision might occur + return ((Long) curVal).intValue(); + } + return ((BigDecimal) curVal).intValue(); + } + } + + @Override + public int getInt(int columnIndex) throws SQLException { + if (cursor == null) { + return 0; // the column value; if the value is SQL NULL, the value returned is 0 as per + // java.sql.ResultSet definition + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex); + return (fieldValue == null || fieldValue.getValue() == null) + ? 0 + : fieldValue.getNumericValue().intValue(); + } else { // Data received from Read API (Arrow) + return getInt(schemaFieldList.get(columnIndex).getName()); + } + } + + @Override + public long getLong(String fieldName) throws SQLException { + if (fieldName == null) { + throw new SQLException("fieldName can't be null"); + } + if (cursor == null) { + throw new BigQuerySQLException(NULL_CURSOR_MSG); + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName); + return (fieldValue == null || fieldValue.getValue() == null) + ? 0L + : fieldValue.getNumericValue().longValue(); + } else { // Data received from Read API (Arrow) + Row curRow = (Row) cursor; + if (!curRow.hasField(fieldName)) { + throw new SQLException(String.format("Field %s not found", fieldName)); + } + Object curVal = curRow.get(fieldName); + if (curVal == null) { + return 0L; + } else { // value will be a Long or BigDecimal, both of which are Numbers + return ((Number) curVal).longValue(); + } + } + } + + @Override + public long getLong(int columnIndex) throws SQLException { + if (cursor == null) { + return 0L; // the column value; if the value is SQL NULL, the value returned is 0 as per + // java.sql.ResultSet definition + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex); + return (fieldValue == null || fieldValue.getValue() == null) + ? 0L + : fieldValue.getNumericValue().longValue(); + } else { // Data received from Read API (Arrow) + return getLong(schemaFieldList.get(columnIndex).getName()); + } + } + + @Override + public double getDouble(String fieldName) throws SQLException { + if (fieldName == null) { + throw new SQLException("fieldName can't be null"); + } + if (cursor == null) { + throw new BigQuerySQLException(NULL_CURSOR_MSG); + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName); + return (fieldValue == null || fieldValue.getValue() == null) + ? 0d + : fieldValue.getNumericValue().doubleValue(); + } else { // Data received from Read API (Arrow) + Row curRow = (Row) cursor; + if (!curRow.hasField(fieldName)) { + throw new SQLException(String.format("Field %s not found", fieldName)); + } + Object curVal = curRow.get(fieldName); + return curVal == null ? 0.0d : ((BigDecimal) curVal).doubleValue(); + } + } + + @Override + public double getDouble(int columnIndex) throws SQLException { + if (cursor == null) { + return 0d; // the column value; if the value is SQL NULL, the value returned is 0 as per + // java.sql.ResultSet definition + } else if (cursor instanceof FieldValueList) { + FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex); + return (fieldValue == null || fieldValue.getValue() == null) + ? 
+
+  @Override
+  public BigDecimal getBigDecimal(String fieldName) throws SQLException {
+    if (fieldName == null) {
+      throw new SQLException("fieldName can't be null");
+    }
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          // NOTE: goes through double, so very wide NUMERIC values are rounded
+          : BigDecimal.valueOf(fieldValue.getNumericValue().doubleValue());
+    } else { // Data received from Read API (Arrow)
+      return BigDecimal.valueOf(getDouble(fieldName));
+    }
+  }
+
+  @Override
+  public BigDecimal getBigDecimal(int columnIndex) throws SQLException {
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : BigDecimal.valueOf(fieldValue.getNumericValue().doubleValue());
+    } else { // Data received from Read API (Arrow)
+      return getBigDecimal(schemaFieldList.get(columnIndex).getName());
+    }
+  }
+
+  @Override
+  public boolean getBoolean(String fieldName) throws SQLException {
+    if (fieldName == null) {
+      throw new SQLException("fieldName can't be null");
+    }
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName);
+      return fieldValue.getValue() != null && fieldValue.getBooleanValue();
+    } else { // Data received from Read API (Arrow)
+      Row curRow = (Row) cursor;
+      if (!curRow.hasField(fieldName)) {
+        throw new SQLException(String.format("Field %s not found", fieldName));
+      }
+      Object curVal = curRow.get(fieldName);
+      return curVal != null && (Boolean) curVal;
+    }
+  }
+
+  @Override
+  public boolean getBoolean(int columnIndex) throws SQLException {
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex);
+      return fieldValue.getValue() != null && fieldValue.getBooleanValue();
+    } else { // Data received from Read API (Arrow)
+      return getBoolean(schemaFieldList.get(columnIndex).getName());
+    }
+  }
+
+  @Override
+  public byte[] getBytes(String fieldName) throws SQLException {
+    if (fieldName == null) {
+      throw new SQLException("fieldName can't be null");
+    }
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : fieldValue.getBytesValue();
+    } else { // Data received from Read API (Arrow)
+      Row curRow = (Row) cursor;
+      if (!curRow.hasField(fieldName)) {
+        throw new SQLException(String.format("Field %s not found", fieldName));
+      }
+      Object curVal = curRow.get(fieldName);
+      return curVal == null ? null : (byte[]) curVal;
+    }
+  }
+
+  @Override
+  public byte[] getBytes(int columnIndex) throws SQLException {
+    if (cursor == null) {
+      return null; // if the value is SQL NULL, the value returned is null
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : fieldValue.getBytesValue();
+    } else { // Data received from Read API (Arrow)
+      return getBytes(schemaFieldList.get(columnIndex).getName());
+    }
+  }
+
+  @Override
+  public Timestamp getTimestamp(String fieldName) throws SQLException {
+    if (fieldName == null) {
+      throw new SQLException("fieldName can't be null"); // consistent with the other getters
+    }
+    if (cursor == null) {
+      return null; // if the value is SQL NULL, the value returned is null
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : new Timestamp(
+              fieldValue.getTimestampValue()
+                  / 1000); // getTimestampValue returns time in microseconds, and Timestamp
+      // expects it in millis
+    } else {
+      Row curRow = (Row) cursor;
+      if (!curRow.hasField(fieldName)) {
+        throw new SQLException(String.format("Field %s not found", fieldName));
+      }
+      Object timeStampVal = curRow.get(fieldName);
+      return timeStampVal == null
+          ? null
+          : new Timestamp((Long) timeStampVal / 1000); // Timestamp is represented as a Long
+    }
+  }
+
+  @Override
+  public Timestamp getTimestamp(int columnIndex) throws SQLException {
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : new Timestamp(
+              fieldValue.getTimestampValue()
+                  / 1000); // getTimestampValue returns time in microseconds, and Timestamp
+      // expects it in millis
+    } else { // Data received from Read API (Arrow)
+      return getTimestamp(schemaFieldList.get(columnIndex).getName());
+    }
+  }
+
+  @Override
+  public Time getTime(String fieldName) throws SQLException {
+    if (fieldName == null) {
+      throw new SQLException("fieldName can't be null"); // consistent with the other getters
+    }
+    if (cursor == null) {
+      return null; // if the value is SQL NULL, the value returned is null
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName);
+      return getTimeFromFieldVal(fieldValue);
+    } else { // Data received from Read API (Arrow)
+      Row curRow = (Row) cursor;
+      if (!curRow.hasField(fieldName)) {
+        throw new SQLException(String.format("Field %s not found", fieldName));
+      }
+      Object timeStampObj = curRow.get(fieldName);
+      return timeStampObj == null
+          ? null
+          : new Time(
+              ((Long) timeStampObj)
+                  / 1000); // Time.toString() will return 12:11:35 in GMT as 17:41:35 in
+      // (GMT+5:30). This can be offset using getTimeZoneOffset
+    }
+  }
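For illustration only (not part of this change): the Long backing an Arrow TIME value above is microseconds since midnight UTC, while java.sql.Time.toString() renders in the JVM's default zone, which is exactly the skew the comment describes. A small sketch of compensating with the zone offset (the microsecond value is made up):

```
import java.sql.Time;
import java.util.TimeZone;

public class TimeOffsetSketch {
  public static void main(String[] args) {
    long micros = 43_895_000_000L; // 12:11:35 after midnight, in microseconds
    long millisUtc = micros / 1000;
    // Time.toString() formats in the default zone; removing the offset makes the
    // printed wall-clock time match the UTC time-of-day.
    int offsetMillis = TimeZone.getDefault().getOffset(System.currentTimeMillis());
    System.out.println("raw      = " + new Time(millisUtc)); // zone-shifted
    System.out.println("adjusted = " + new Time(millisUtc - offsetMillis)); // 12:11:35
  }
}
```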
+
+  private int getTimeZoneOffset() {
+    TimeZone timeZone = TimeZone.getTimeZone(ZoneId.systemDefault());
+    return timeZone.getOffset(new java.util.Date().getTime()); // offset in milliseconds
+  }
+
+  @Override
+  public Time getTime(int columnIndex) throws SQLException {
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex);
+      return getTimeFromFieldVal(fieldValue);
+    } else { // Data received from Read API (Arrow)
+      return getTime(schemaFieldList.get(columnIndex).getName());
+    }
+  }
+
+  private Time getTimeFromFieldVal(FieldValue fieldValue) throws SQLException {
+    if (fieldValue.getValue() != null) {
+      // Time ranges from 00:00:00 to 23:59:59.999999 in BigQuery. Parsing it to java.sql.Time
+      String strTime = fieldValue.getStringValue();
+      String[] timeSplt = strTime.split(":");
+      if (timeSplt.length != 3) {
+        throw new SQLException("Can not parse the value " + strTime + " to java.sql.Time");
+      }
+      int hr = Integer.parseInt(timeSplt[0]);
+      int min = Integer.parseInt(timeSplt[1]);
+      int sec = 0, nanoSec = 0;
+      if (timeSplt[2].contains(".")) {
+        String[] secSplt = timeSplt[2].split("\\.");
+        sec = Integer.parseInt(secSplt[0]);
+        nanoSec = Integer.parseInt(secSplt[1]);
+      } else {
+        sec = Integer.parseInt(timeSplt[2]);
+      }
+      return Time.valueOf(LocalTime.of(hr, min, sec, nanoSec));
+    } else {
+      return null;
+    }
+  }
+
+  @Override
+  public Date getDate(String fieldName) throws SQLException {
+    if (fieldName == null) {
+      throw new SQLException("fieldName can't be null");
+    }
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(fieldName);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : Date.valueOf(fieldValue.getStringValue());
+    } else { // Data received from Read API (Arrow)
+      Row curRow = (Row) cursor;
+      if (!curRow.hasField(fieldName)) {
+        throw new SQLException(String.format("Field %s not found", fieldName));
+      }
+      Object dateObj = curRow.get(fieldName);
+      if (dateObj == null) {
+        return null;
+      } else {
+        Integer dateInt = (Integer) dateObj;
+        long dateInMillis =
+            TimeUnit.DAYS.toMillis(
+                Long.valueOf(
+                    dateInt)); // For example, int 18993 represents 2022-01-01; converting days to
+        // milliseconds
+        return new Date(dateInMillis);
+      }
+    }
+  }
+
+  @Override
+  public Date getDate(int columnIndex) throws SQLException {
+    if (cursor == null) {
+      throw new BigQuerySQLException(NULL_CURSOR_MSG);
+    } else if (cursor instanceof FieldValueList) {
+      FieldValue fieldValue = ((FieldValueList) cursor).get(columnIndex);
+      return (fieldValue == null || fieldValue.getValue() == null)
+          ? null
+          : Date.valueOf(fieldValue.getStringValue());
+    } else { // Data received from Read API (Arrow)
+      return getDate(schemaFieldList.get(columnIndex).getName());
+    }
+  }
+  }
+
+  @Override
+  public BigQueryResultStats getBigQueryResultStats() {
+    return bigQueryResultStats;
+  }
+}
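For illustration only (not part of this change): taken together, the getters above give callers a standard java.sql.ResultSet view that behaves the same whether a page came from tabledata.list (FieldValueList) or from the Read API (Arrow Row). A hedged end-to-end sketch; createConnection/executeSelect follow the usage shown in the benchmark and the Connection interface elsewhere in this diff, and the query is illustrative:

```
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Connection;
import com.google.cloud.bigquery.ConnectionSettings;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ResultSetConsumptionSketch {
  public static void main(String[] args) throws SQLException {
    Connection connection =
        BigQueryOptions.getDefaultInstance()
            .getService()
            .createConnection(ConnectionSettings.newBuilder().build());
    ResultSet rs =
        connection
            .executeSelect(
                "SELECT corpus, COUNT(*) cnt FROM `bigquery-public-data.samples.shakespeare`"
                    + " GROUP BY corpus")
            .getResultSet();
    while (rs.next()) { // identical cursor semantics for JSON and Arrow pages
      System.out.printf("%s: %d%n", rs.getString("corpus"), rs.getLong("cnt"));
    }
    connection.close();
  }
}
```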
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultStats.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultStats.java
new file mode 100644
index 000000000..a4c37a9b6
--- /dev/null
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultStats.java
@@ -0,0 +1,36 @@
+/*
+ * Copyright 2021 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package com.google.cloud.bigquery;
+
+import com.google.api.core.BetaApi;
+import com.google.cloud.bigquery.JobStatistics.QueryStatistics;
+import com.google.cloud.bigquery.JobStatistics.SessionInfo;
+
+public interface BigQueryResultStats {
+
+  /** Returns query statistics of a query job */
+  @BetaApi
+  QueryStatistics getQueryStatistics();
+
+  /**
+   * Returns SessionInfo, which contains information about the session if this job is part of one.
+   * The JobStatistics2 model class does not allow setSessionInfo, so this cannot be set as part of
+   * QueryStatistics when we use the jobs.query API.
+   */
+  @BetaApi
+  SessionInfo getSessionInfo();
+}
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultStatsImpl.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultStatsImpl.java
new file mode 100644
index 000000000..53d67f8f3
--- /dev/null
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryResultStatsImpl.java
@@ -0,0 +1,41 @@
+/*
+ * Copyright 2021 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */ + +package com.google.cloud.bigquery; + +import com.google.cloud.bigquery.JobStatistics.QueryStatistics; +import com.google.cloud.bigquery.JobStatistics.SessionInfo; + +public class BigQueryResultStatsImpl implements BigQueryResultStats { + + private final QueryStatistics queryStatistics; + private final SessionInfo sessionInfo; + + public BigQueryResultStatsImpl(QueryStatistics queryStatistics, SessionInfo sessionInfo) { + this.queryStatistics = queryStatistics; + this.sessionInfo = sessionInfo; + } + + @Override + public QueryStatistics getQueryStatistics() { + return queryStatistics; + } + + @Override + public SessionInfo getSessionInfo() { + return sessionInfo; + } +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuerySQLException.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuerySQLException.java new file mode 100644 index 000000000..672c6ad3f --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQuerySQLException.java @@ -0,0 +1,86 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.google.cloud.bigquery; + +import java.sql.SQLException; +import java.util.List; + +/** + * BigQuery service exception. + * + * @see Google Cloud BigQuery + * error codes + */ +public final class BigQuerySQLException extends SQLException { + + private static final long serialVersionUID = -5006625989225438209L; + private final List errors; + + public BigQuerySQLException() { + this.errors = null; + } + + public BigQuerySQLException( + String msg) { // overloaded constructor with just message as an argument + super(msg); + this.errors = null; + } + + public BigQuerySQLException(List errors) { + this.errors = errors; + } + + public BigQuerySQLException(List errors, Throwable cause) { + super(cause != null ? cause.toString() : null); + this.errors = errors; + } + + public BigQuerySQLException(String reason, List errors) { + super(reason); + this.errors = errors; + } + + public BigQuerySQLException(String reason, Throwable cause, List errors) { + super(reason, cause); + this.errors = errors; + } + + public BigQuerySQLException(String reason, String sqlState, List errors) { + super(reason, sqlState); + this.errors = errors; + } + + public BigQuerySQLException( + String reason, String sqlState, int errorCode, List errors) { + super(reason, sqlState, errorCode); + this.errors = errors; + } + + public BigQuerySQLException( + String reason, String sqlState, int errorCode, Throwable cause, List errors) { + super(reason, sqlState, errorCode, cause); + this.errors = errors; + } + + /** + * Returns a list of {@link BigQueryError}s that caused this exception. Returns {@code null} if + * none exists. 
+ */ + public List getErrors() { + return errors; + } +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Connection.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Connection.java new file mode 100644 index 000000000..109838d8b --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Connection.java @@ -0,0 +1,92 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.google.cloud.bigquery; + +import com.google.api.core.BetaApi; +import java.util.List; +import java.util.Map; + +/** + * A Connection is a session between a Java application and BigQuery. SQL statements are executed + * and results are returned within the context of a connection. + */ +public interface Connection { + + /** Sends a query cancel request. This call will return immediately */ + @BetaApi + boolean close() throws BigQuerySQLException; + + /** + * Execute a query dry run that returns information on the schema and query parameters of the + * query results. + * + * @param sql typically a static SQL SELECT statement + * @exception BigQuerySQLException if a database access error occurs + */ + @BetaApi + BigQueryDryRunResult dryRun(String sql) throws BigQuerySQLException; + + /** + * Execute a SQL statement that returns a single ResultSet. + * + *

<p>Example of running a query.
+   *
+   * <pre>
+   * {
+   *   @code
+   *   ConnectionSettings connectionSettings =
+   *        ConnectionSettings.newBuilder()
+   *            .setRequestTimeout(10L)
+   *            .setMaxResults(100L)
+   *            .setUseQueryCache(true)
+   *            .build();
+   *   Connection connection = bigquery.createConnection(connectionSettings);
+   *   String selectQuery = "SELECT corpus FROM `bigquery-public-data.samples.shakespeare` GROUP BY corpus;";
+   *   BigQueryResult bqResultSet = connection.executeSelect(selectQuery);
+   *   ResultSet rs = bqResultSet.getResultSet();
+   *   while (rs.next()) {
+   *       System.out.printf("%s,", rs.getString("corpus"));
+   *   }
+   * }
+   * </pre>
+ * + * @param sql a static SQL SELECT statement + * @return a ResultSet that contains the data produced by the query + * @exception BigQuerySQLException if a database access error occurs + */ + @BetaApi + BigQueryResult executeSelect(String sql) throws BigQuerySQLException; + + /** + * This method executes a SQL SELECT query + * + * @param sql SQL SELECT query + * @param parameters named or positional parameters. The set of query parameters must either be + * all positional or all named parameters. + * @param labels (optional) the labels associated with this query. You can use these to organize + * and group your query jobs. Label keys and values can be no longer than 63 characters, can + * only contain lowercase letters, numeric characters, underscores and dashes. International + * characters are allowed. Label values are optional and Label is a Varargs. You should pass + * all the Labels in a single Map .Label keys must start with a letter and each label in the + * list must have a different key. + * @return BigQueryResult containing the output of the query + * @throws BigQuerySQLException + */ + @BetaApi + BigQueryResult executeSelect( + String sql, List parameters, Map... labels) + throws BigQuerySQLException; +} diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionImpl.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionImpl.java new file mode 100644 index 000000000..c24a00888 --- /dev/null +++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionImpl.java @@ -0,0 +1,1240 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package com.google.cloud.bigquery; + +import static com.google.cloud.RetryHelper.runWithRetries; +import static java.net.HttpURLConnection.HTTP_NOT_FOUND; + +import com.google.api.core.BetaApi; +import com.google.api.services.bigquery.model.GetQueryResultsResponse; +import com.google.api.services.bigquery.model.JobConfigurationQuery; +import com.google.api.services.bigquery.model.QueryParameter; +import com.google.api.services.bigquery.model.QueryRequest; +import com.google.api.services.bigquery.model.TableDataList; +import com.google.api.services.bigquery.model.TableRow; +import com.google.cloud.RetryHelper; +import com.google.cloud.Tuple; +import com.google.cloud.bigquery.JobStatistics.QueryStatistics; +import com.google.cloud.bigquery.JobStatistics.SessionInfo; +import com.google.cloud.bigquery.spi.v2.BigQueryRpc; +import com.google.cloud.bigquery.storage.v1.ArrowRecordBatch; +import com.google.cloud.bigquery.storage.v1.ArrowSchema; +import com.google.cloud.bigquery.storage.v1.BigQueryReadClient; +import com.google.cloud.bigquery.storage.v1.CreateReadSessionRequest; +import com.google.cloud.bigquery.storage.v1.DataFormat; +import com.google.cloud.bigquery.storage.v1.ReadRowsRequest; +import com.google.cloud.bigquery.storage.v1.ReadRowsResponse; +import com.google.cloud.bigquery.storage.v1.ReadSession; +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Function; +import com.google.common.base.Strings; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Iterables; +import com.google.common.collect.Lists; +import com.google.common.collect.Maps; +import java.io.IOException; +import java.util.AbstractList; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.Queue; +import java.util.UUID; +import java.util.concurrent.BlockingQueue; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.LinkedBlockingDeque; +import java.util.concurrent.TimeUnit; +import java.util.logging.Level; +import java.util.logging.Logger; +import java.util.stream.Collectors; +import org.apache.arrow.memory.BufferAllocator; +import org.apache.arrow.memory.RootAllocator; +import org.apache.arrow.vector.FieldVector; +import org.apache.arrow.vector.VectorLoader; +import org.apache.arrow.vector.VectorSchemaRoot; +import org.apache.arrow.vector.ipc.ReadChannel; +import org.apache.arrow.vector.ipc.message.MessageSerializer; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.util.ByteArrayReadableSeekableByteChannel; + +/** Implementation for {@link Connection}, the generic BigQuery connection API (not JDBC). 
*/
+class ConnectionImpl implements Connection {
+
+  private final ConnectionSettings connectionSettings;
+  private final BigQueryOptions bigQueryOptions;
+  private final BigQueryRpc bigQueryRpc;
+  private final BigQueryRetryConfig retryConfig;
+  private final int bufferSize; // buffer size in Producer Thread
+  private final int MAX_PROCESS_QUERY_THREADS_CNT = 5;
+  private final ExecutorService queryTaskExecutor =
+      Executors.newFixedThreadPool(MAX_PROCESS_QUERY_THREADS_CNT);
+  private final Logger logger = Logger.getLogger(this.getClass().getName());
+  private BigQueryReadClient bqReadClient;
+  private static final long EXECUTOR_TIMEOUT_SEC = 5;
+
+  ConnectionImpl(
+      ConnectionSettings connectionSettings,
+      BigQueryOptions bigQueryOptions,
+      BigQueryRpc bigQueryRpc,
+      BigQueryRetryConfig retryConfig) {
+    this.connectionSettings = connectionSettings;
+    this.bigQueryOptions = bigQueryOptions;
+    this.bigQueryRpc = bigQueryRpc;
+    this.retryConfig = retryConfig;
+    // Sets a reasonable buffer size (a blocking queue) if user input is suboptimal
+    this.bufferSize =
+        (connectionSettings == null
+                || connectionSettings.getNumBufferedRows() == null
+                || connectionSettings.getNumBufferedRows() < 10000
+            ? 20000
+            : Math.min(connectionSettings.getNumBufferedRows() * 2, 100000));
+  }
+
+  /**
+   * This method shuts down the pageFetcher and producerWorker threads gracefully using an
+   * interrupt. The pageFetcher thread will not request any subsequent pages after being
+   * interrupted and will shut down as soon as any ongoing RPC call returns. The producerWorker
+   * thread will not populate the buffer with any further records; it will clear the buffer, put
+   * an EoF marker, and shut down.
+   *
+   * @return Boolean value true if the threads were interrupted
+   * @throws BigQuerySQLException
+   */
+  @BetaApi
+  @Override
+  public synchronized boolean close() throws BigQuerySQLException {
+    queryTaskExecutor.shutdownNow();
+    try {
+      queryTaskExecutor.awaitTermination(
+          EXECUTOR_TIMEOUT_SEC, TimeUnit.SECONDS); // wait for the executor shutdown
+    } catch (InterruptedException e) {
+      e.printStackTrace();
+      logger.log(
+          Level.WARNING,
+          "\n" + Thread.currentThread().getName() + " Exception while awaitTermination",
+          e); // Logging InterruptedException instead of throwing the exception back; close method
+      // will return queryTaskExecutor.isShutdown()
+    }
+    return queryTaskExecutor.isShutdown(); // check if the executor has been shutdown
+  }
+
+  /**
+   * This method runs a dry run query
+   *
+   * @param sql SQL SELECT statement
+   * @return BigQueryDryRunResult containing List<Parameter> and Schema
+   * @throws BigQuerySQLException
+   */
+  @BetaApi
+  @Override
+  public BigQueryDryRunResult dryRun(String sql) throws BigQuerySQLException {
+    com.google.api.services.bigquery.model.Job dryRunJob = createDryRunJob(sql);
+    Schema schema = Schema.fromPb(dryRunJob.getStatistics().getQuery().getSchema());
+    List<QueryParameter> queryParametersPb =
+        dryRunJob.getStatistics().getQuery().getUndeclaredQueryParameters();
+    List<Parameter> queryParameters =
+        Lists.transform(queryParametersPb, QUERY_PARAMETER_FROM_PB_FUNCTION);
+    QueryStatistics queryStatistics = JobStatistics.fromPb(dryRunJob);
+    SessionInfo sessionInfo =
+        queryStatistics.getSessionInfo() == null ? null : queryStatistics.getSessionInfo();
+    BigQueryResultStats bigQueryResultStats =
+        new BigQueryResultStatsImpl(queryStatistics, sessionInfo);
+    return new BigQueryDryRunResultImpl(schema, queryParameters, bigQueryResultStats);
+  }
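For illustration only (not part of this change): dryRun plans the statement without starting a job, which makes it a cheap way to fetch the result Schema and any undeclared query parameters up front. A sketch, under the assumption that BigQueryDryRunResult exposes getSchema() and getQueryParameters() accessors matching the Impl constructor above:

```
import com.google.cloud.bigquery.BigQueryDryRunResult;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Connection;
import com.google.cloud.bigquery.ConnectionSettings;

public class DryRunSketch {
  public static void main(String[] args) throws Exception {
    Connection connection =
        BigQueryOptions.getDefaultInstance()
            .getService()
            .createConnection(ConnectionSettings.newBuilder().build());
    // No job is executed; BigQuery only plans the query.
    BigQueryDryRunResult plan =
        connection.dryRun(
            "SELECT word FROM `bigquery-public-data.samples.shakespeare` WHERE corpus = ?");
    System.out.println("Schema: " + plan.getSchema());
    System.out.println("Undeclared parameters: " + plan.getQueryParameters());
    connection.close();
  }
}
```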
+
+  /**
+   * This method executes a SQL SELECT query
+   *
+   * @param sql SQL SELECT statement
+   * @return BigQueryResult containing the output of the query
+   * @throws BigQuerySQLException
+   */
+  @BetaApi
+  @Override
+  public BigQueryResult executeSelect(String sql) throws BigQuerySQLException {
+    try {
+      // use jobs.query if all the properties of connectionSettings are supported
+      if (isFastQuerySupported()) {
+        String projectId = bigQueryOptions.getProjectId();
+        QueryRequest queryRequest = createQueryRequest(connectionSettings, sql, null, null);
+        return queryRpc(projectId, queryRequest, false);
+      }
+      // use jobs.insert otherwise
+      com.google.api.services.bigquery.model.Job queryJob =
+          createQueryJob(sql, connectionSettings, null, null);
+      JobId jobId = JobId.fromPb(queryJob.getJobReference());
+      GetQueryResultsResponse firstPage = getQueryResultsFirstPage(jobId);
+      return getResultSet(firstPage, jobId, sql, false);
+    } catch (BigQueryException e) {
+      throw new BigQuerySQLException(e.getMessage(), e, e.getErrors());
+    }
+  }
+
+  /**
+   * This method executes a SQL SELECT query
+   *
+   * @param sql SQL SELECT query
+   * @param parameters named or positional parameters. The set of query parameters must either be
+   *     all positional or all named parameters.
+   * @param labels the labels associated with this query. You can use these to organize and group
+   *     your query jobs. Label keys and values can be no longer than 63 characters, can only
+   *     contain lowercase letters, numeric characters, underscores and dashes. International
+   *     characters are allowed. Label values are optional and Label is a Varargs. You should pass
+   *     all the labels in a single Map. Label keys must start with a letter and each label in the
+   *     list must have a different key.
+   * @return BigQueryResult containing the output of the query
+   * @throws BigQuerySQLException
+   */
+  @BetaApi
+  @Override
+  public BigQueryResult executeSelect(
+      String sql, List<Parameter> parameters, Map<String, String>...
labels) + throws BigQuerySQLException { + Map labelMap = null; + if (labels != null + && labels.length == 1) { // We expect label as a key value pair in a single Map + labelMap = labels[0]; + } + try { + // use jobs.query if possible + if (isFastQuerySupported()) { + final String projectId = bigQueryOptions.getProjectId(); + final QueryRequest queryRequest = + createQueryRequest(connectionSettings, sql, parameters, labelMap); + return queryRpc(projectId, queryRequest, parameters != null); + } + // use jobs.insert otherwise + com.google.api.services.bigquery.model.Job queryJob = + createQueryJob(sql, connectionSettings, parameters, labelMap); + JobId jobId = JobId.fromPb(queryJob.getJobReference()); + GetQueryResultsResponse firstPage = getQueryResultsFirstPage(jobId); + return getResultSet(firstPage, jobId, sql, parameters != null); + } catch (BigQueryException e) { + throw new BigQuerySQLException(e.getMessage(), e, e.getErrors()); + } + } + + @VisibleForTesting + BigQueryResult getResultSet( + GetQueryResultsResponse firstPage, JobId jobId, String sql, Boolean hasQueryParameters) { + if (firstPage.getJobComplete() + && firstPage.getTotalRows() + != null) { // firstPage.getTotalRows() is null if job is not complete + return getSubsequentQueryResultsWithJob( + firstPage.getTotalRows().longValue(), + (long) firstPage.getRows().size(), + jobId, + firstPage, + hasQueryParameters); + } else { // job is still running, use dryrun to get Schema + com.google.api.services.bigquery.model.Job dryRunJob = createDryRunJob(sql); + Schema schema = Schema.fromPb(dryRunJob.getStatistics().getQuery().getSchema()); + // TODO: check how can we get totalRows and pageRows while the job is still running. + // `firstPage.getTotalRows()` returns null + return getSubsequentQueryResultsWithJob( + null, null, jobId, firstPage, schema, hasQueryParameters); + } + } + + static class EndOfFieldValueList + extends AbstractList< + FieldValue> { // A reference of this class is used as a token to inform the thread + // consuming `buffer` BigQueryResultImpl that we have run out of records + @Override + public FieldValue get(int index) { + return null; + } + + @Override + public int size() { + return 0; + } + } + + private BigQueryResult queryRpc( + final String projectId, final QueryRequest queryRequest, Boolean hasQueryParameters) { + com.google.api.services.bigquery.model.QueryResponse results; + try { + results = + BigQueryRetryHelper.runWithRetries( + () -> bigQueryRpc.queryRpc(projectId, queryRequest), + bigQueryOptions.getRetrySettings(), + BigQueryBaseService.BIGQUERY_EXCEPTION_HANDLER, + bigQueryOptions.getClock(), + retryConfig); + } catch (BigQueryRetryHelper.BigQueryRetryHelperException e) { + throw BigQueryException.translateAndThrow(e); + } + + if (results.getErrors() != null) { + List bigQueryErrors = + results.getErrors().stream() + .map(BigQueryError.FROM_PB_FUNCTION) + .collect(Collectors.toList()); + // Throwing BigQueryException since there may be no JobId, and we want to stay consistent + // with the case where there is an HTTP error + throw new BigQueryException(bigQueryErrors); + } + + // Query finished running and we can paginate all the results + if (results.getJobComplete() && results.getSchema() != null) { + return processQueryResponseResults(results); + } else { + // Query is long-running (> 10s) and hasn't completed yet, or query completed but didn't + // return the schema, fallback to jobs.insert path. 
Some operations don't return the schema + // and can be optimized here, but this is left as future work. + Long totalRows = results.getTotalRows() == null ? null : results.getTotalRows().longValue(); + Long pageRows = results.getRows() == null ? null : (long) (results.getRows().size()); + JobId jobId = JobId.fromPb(results.getJobReference()); + GetQueryResultsResponse firstPage = getQueryResultsFirstPage(jobId); + return getSubsequentQueryResultsWithJob( + totalRows, pageRows, jobId, firstPage, hasQueryParameters); + } + } + + @VisibleForTesting + BigQueryResultStats getBigQueryResultSetStats(JobId jobId) { + // Create GetQueryResultsResponse query statistics + Job queryJob = getQueryJobRpc(jobId); + QueryStatistics queryStatistics = queryJob.getStatistics(); + SessionInfo sessionInfo = + queryStatistics.getSessionInfo() == null ? null : queryStatistics.getSessionInfo(); + return new BigQueryResultStatsImpl(queryStatistics, sessionInfo); + } + /* This method processed the first page of GetQueryResultsResponse and then it uses tabledata.list */ + @VisibleForTesting + BigQueryResult tableDataList(GetQueryResultsResponse firstPage, JobId jobId) { + Schema schema; + long numRows; + schema = Schema.fromPb(firstPage.getSchema()); + numRows = firstPage.getTotalRows().longValue(); + + BigQueryResultStats bigQueryResultStats = getBigQueryResultSetStats(jobId); + + // Keeps the deserialized records at the row level, which is consumed by BigQueryResult + BlockingQueue> buffer = new LinkedBlockingDeque<>(bufferSize); + + // Keeps the parsed FieldValueLists + BlockingQueue, Boolean>> pageCache = + new LinkedBlockingDeque<>( + getPageCacheSize(connectionSettings.getNumBufferedRows(), schema)); + + // Keeps the raw RPC responses + BlockingQueue> rpcResponseQueue = + new LinkedBlockingDeque<>( + getPageCacheSize(connectionSettings.getNumBufferedRows(), schema)); + + runNextPageTaskAsync(firstPage.getPageToken(), getDestinationTable(jobId), rpcResponseQueue); + + parseRpcDataAsync( + firstPage.getRows(), + schema, + pageCache, + rpcResponseQueue); // parses data on a separate thread, thus maximising processing + // throughput + + populateBufferAsync( + rpcResponseQueue, pageCache, buffer); // spawns a thread to populate the buffer + + // This will work for pagination as well, as buffer is getting updated asynchronously + return new BigQueryResultImpl>( + schema, numRows, buffer, bigQueryResultStats); + } + + @VisibleForTesting + BigQueryResult processQueryResponseResults( + com.google.api.services.bigquery.model.QueryResponse results) { + Schema schema; + long numRows; + schema = Schema.fromPb(results.getSchema()); + numRows = + results.getTotalRows() == null + ? 0 + : results.getTotalRows().longValue(); // in case of DML or DDL + // QueryResponse only provides cache hits, dmlStats, and sessionInfo as query processing + // statistics + DmlStats dmlStats = + results.getDmlStats() == null ? null : DmlStats.fromPb(results.getDmlStats()); + Boolean cacheHit = results.getCacheHit(); + QueryStatistics queryStatistics = + QueryStatistics.newBuilder().setDmlStats(dmlStats).setCacheHit(cacheHit).build(); + // We cannot directly set sessionInfo in QueryStatistics + SessionInfo sessionInfo = + results.getSessionInfo() == null + ? 
null + : JobStatistics.SessionInfo.fromPb(results.getSessionInfo()); + BigQueryResultStats bigQueryResultStats = + new BigQueryResultStatsImpl(queryStatistics, sessionInfo); + + BlockingQueue> buffer = new LinkedBlockingDeque<>(bufferSize); + BlockingQueue, Boolean>> pageCache = + new LinkedBlockingDeque<>( + getPageCacheSize(connectionSettings.getNumBufferedRows(), schema)); + BlockingQueue> rpcResponseQueue = + new LinkedBlockingDeque<>( + getPageCacheSize(connectionSettings.getNumBufferedRows(), schema)); + + JobId jobId = JobId.fromPb(results.getJobReference()); + + // Thread to make rpc calls to fetch data from the server + runNextPageTaskAsync(results.getPageToken(), getDestinationTable(jobId), rpcResponseQueue); + + // Thread to parse data received from the server to client library objects + parseRpcDataAsync(results.getRows(), schema, pageCache, rpcResponseQueue); + + // Thread to populate the buffer (a blocking queue) shared with the consumer + populateBufferAsync(rpcResponseQueue, pageCache, buffer); + + return new BigQueryResultImpl>( + schema, numRows, buffer, bigQueryResultStats); + } + + @VisibleForTesting + void runNextPageTaskAsync( + String firstPageToken, + TableId destinationTable, + BlockingQueue> rpcResponseQueue) { + // This thread makes the RPC calls and paginates + Runnable nextPageTask = + () -> { + String pageToken = firstPageToken; // results.getPageToken(); + try { + while (pageToken != null) { // paginate for non null token + if (Thread.currentThread().isInterrupted() + || queryTaskExecutor.isShutdown()) { // do not process further pages and shutdown + break; + } + TableDataList tabledataList = tableDataListRpc(destinationTable, pageToken); + pageToken = tabledataList.getPageToken(); + rpcResponseQueue.put( + Tuple.of( + tabledataList, + true)); // this will be parsed asynchronously without blocking the current + // thread + } + rpcResponseQueue.put( + Tuple.of( + null, + false)); // this will stop the parseDataTask as well in case of interrupt or + // when the pagination completes + } catch (Exception e) { + throw new BigQueryException(0, e.getMessage(), e); + } + }; + queryTaskExecutor.execute(nextPageTask); + } + + /* + This method takes TableDataList from rpcResponseQueue and populates pageCache with FieldValueList + */ + @VisibleForTesting + void parseRpcDataAsync( + // com.google.api.services.bigquery.model.QueryResponse results, + List tableRows, + Schema schema, + BlockingQueue, Boolean>> pageCache, + BlockingQueue> rpcResponseQueue) { + + // parse and put the first page in the pageCache before the other pages are parsed from the RPC + // calls + Iterable firstFieldValueLists = getIterableFieldValueList(tableRows, schema); + try { + pageCache.put( + Tuple.of(firstFieldValueLists, true)); // this is the first page which we have received. 
+ } catch (InterruptedException e) { + throw new BigQueryException(0, e.getMessage(), e); + } + + // rpcResponseQueue will get null tuple if Cancel method is called, so no need to explicitly use + // thread interrupt here + Runnable parseDataTask = + () -> { + try { + boolean hasMorePages = true; + while (hasMorePages) { + Tuple rpcResponse = rpcResponseQueue.take(); + TableDataList tabledataList = rpcResponse.x(); + hasMorePages = rpcResponse.y(); + if (tabledataList != null) { + Iterable fieldValueLists = + getIterableFieldValueList(tabledataList.getRows(), schema); // Parse + pageCache.put(Tuple.of(fieldValueLists, true)); + } + } + } catch (InterruptedException e) { + logger.log( + Level.WARNING, + "\n" + Thread.currentThread().getName() + " Interrupted", + e); // Thread might get interrupted while calling the Cancel method, which is + // expected, so logging this instead of throwing the exception back + } + try { + pageCache.put(Tuple.of(null, false)); // no further pages + } catch (InterruptedException e) { + logger.log( + Level.WARNING, + "\n" + Thread.currentThread().getName() + " Interrupted", + e); // Thread might get interrupted while calling the Cancel method, which is + // expected, so logging this instead of throwing the exception back + } + }; + queryTaskExecutor.execute(parseDataTask); + } + + @VisibleForTesting + void populateBufferAsync( + BlockingQueue> rpcResponseQueue, + BlockingQueue, Boolean>> pageCache, + BlockingQueue> buffer) { + Runnable populateBufferRunnable = + () -> { // producer thread populating the buffer + Iterable fieldValueLists = null; + boolean hasRows = true; // as we have to process the first page + while (hasRows) { + try { + Tuple, Boolean> nextPageTuple = pageCache.take(); + hasRows = nextPageTuple.y(); + fieldValueLists = nextPageTuple.x(); + } catch (InterruptedException e) { + logger.log( + Level.WARNING, + "\n" + Thread.currentThread().getName() + " Interrupted", + e); // Thread might get interrupted while calling the Cancel method, which is + // expected, so logging this instead of throwing the exception back + } + + if (Thread.currentThread().isInterrupted() + || fieldValueLists + == null) { // do not process further pages and shutdown (outerloop) + break; + } + + for (FieldValueList fieldValueList : fieldValueLists) { + try { + if (Thread.currentThread() + .isInterrupted()) { // do not process further pages and shutdown (inner loop) + break; + } + buffer.put(fieldValueList); + } catch (InterruptedException e) { + throw new BigQueryException(0, e.getMessage(), e); + } + } + } + + if (Thread.currentThread() + .isInterrupted()) { // clear the buffer for any outstanding records + buffer.clear(); + rpcResponseQueue + .clear(); // IMP - so that if it's full then it unblocks and the interrupt logic + // could trigger + } + + try { + buffer.put( + new EndOfFieldValueList()); // All the pages has been processed, put this marker + } catch (InterruptedException e) { + throw new BigQueryException(0, e.getMessage(), e); + } finally { + queryTaskExecutor.shutdownNow(); // Shutdown the thread pool + } + }; + + queryTaskExecutor.execute(populateBufferRunnable); + } + + /* Helper method that parse and populate a page with TableRows */ + private static Iterable getIterableFieldValueList( + Iterable tableDataPb, final Schema schema) { + return ImmutableList.copyOf( + Iterables.transform( + tableDataPb != null ? tableDataPb : ImmutableList.of(), + new Function() { + final FieldList fields = schema != null ? 
schema.getFields() : null;
+
+                  @Override
+                  public FieldValueList apply(TableRow rowPb) {
+                    return FieldValueList.fromPb(rowPb.getF(), fields);
+                  }
+                }));
+  }
+
+  /* Helper method that determines the optimal number of cached pages to improve read performance */
+  @VisibleForTesting
+  int getPageCacheSize(Integer numBufferedRows, Schema schema) {
+    final int MIN_CACHE_SIZE = 3; // Min number of pages to cache
+    final int MAX_CACHE_SIZE = 20; // Max number of pages to cache
+    int numColumns = schema.getFields().size();
+    int numCachedPages;
+    long numCachedRows = numBufferedRows == null ? 0 : numBufferedRows.longValue();
+
+    // TODO: Further enhance this logic depending on customer feedback on memory consumption
+    if (numCachedRows > 10000) {
+      numCachedPages =
+          2; // the size of numBufferedRows is quite large, and as per our tests we should be able
+      // to do enough even with a low number of cached pages
+    } else if (numColumns > 15
+        && numCachedRows
+            > 5000) { // too many fields are being read, setting the page size on the lower end
+      numCachedPages = 3;
+    } else if (numCachedRows < 2000
+        && numColumns < 15) { // low page size with fewer columns, we can cache more pages
+      numCachedPages = 20;
+    } else { // default - under 10K numCachedRows with any number of columns
+      numCachedPages = 5;
+    }
+    return numCachedPages < MIN_CACHE_SIZE
+        ? MIN_CACHE_SIZE
+        : (Math.min(
+            numCachedPages,
+            MAX_CACHE_SIZE)); // numCachedPages should be between the defined min and max
+  }
+
+  /* Returns query results using either tabledata.list or the high throughput Read API */
+  @VisibleForTesting
+  BigQueryResult getSubsequentQueryResultsWithJob(
+      Long totalRows,
+      Long pageRows,
+      JobId jobId,
+      GetQueryResultsResponse firstPage,
+      Boolean hasQueryParameters) {
+    TableId destinationTable = getDestinationTable(jobId);
+    return useReadAPI(totalRows, pageRows, Schema.fromPb(firstPage.getSchema()), hasQueryParameters)
+        ? highThroughPutRead(
+            destinationTable,
+            firstPage.getTotalRows().longValue(),
+            Schema.fromPb(firstPage.getSchema()),
+            getBigQueryResultSetStats(
+                jobId)) // discard first page and stream the entire BigQueryResult using
+        // the Read API
+        : tableDataList(firstPage, jobId);
+  }
+
+  /* Returns query results using either tabledata.list or the high throughput Read API */
+  @VisibleForTesting
+  BigQueryResult getSubsequentQueryResultsWithJob(
+      Long totalRows,
+      Long pageRows,
+      JobId jobId,
+      GetQueryResultsResponse firstPage,
+      Schema schema,
+      Boolean hasQueryParameters) {
+    TableId destinationTable = getDestinationTable(jobId);
+    return useReadAPI(totalRows, pageRows, schema, hasQueryParameters)
+        ? highThroughPutRead(
+            destinationTable,
+            totalRows == null
+                ? -1L
+                : totalRows, // totalRows is null when the job is still running. TODO: Check if
+            // any workaround is possible
+            schema,
+            getBigQueryResultSetStats(
+                jobId)) // discard first page and stream the entire BigQueryResult using
+        // the Read API
+        : tableDataList(firstPage, jobId);
+  }
+
+  /* Returns Job from jobId by calling the jobs.get API */
+  private Job getQueryJobRpc(JobId jobId) {
+    final JobId completeJobId =
+        jobId
+            .setProjectId(bigQueryOptions.getProjectId())
+            .setLocation(
+                jobId.getLocation() == null && bigQueryOptions.getLocation() != null
+                    ?
bigQueryOptions.getLocation() + : jobId.getLocation()); + com.google.api.services.bigquery.model.Job jobPb; + try { + jobPb = + runWithRetries( + () -> + bigQueryRpc.getQueryJob( + completeJobId.getProject(), + completeJobId.getJob(), + completeJobId.getLocation()), + bigQueryOptions.getRetrySettings(), + BigQueryBaseService.BIGQUERY_EXCEPTION_HANDLER, + bigQueryOptions.getClock()); + if (bigQueryOptions.getThrowNotFound() && jobPb == null) { + throw new BigQueryException(HTTP_NOT_FOUND, "Query job not found"); + } + } catch (RetryHelper.RetryHelperException e) { + throw BigQueryException.translateAndThrow(e); + } + return Job.fromPb(bigQueryOptions.getService(), jobPb); + } + + /* Returns the destinationTable from jobId by calling jobs.get API */ + @VisibleForTesting + TableId getDestinationTable(JobId jobId) { + Job job = getQueryJobRpc(jobId); + return ((QueryJobConfiguration) job.getConfiguration()).getDestinationTable(); + } + + @VisibleForTesting + TableDataList tableDataListRpc(TableId destinationTable, String pageToken) { + try { + final TableId completeTableId = + destinationTable.setProjectId( + Strings.isNullOrEmpty(destinationTable.getProject()) + ? bigQueryOptions.getProjectId() + : destinationTable.getProject()); + TableDataList results = + runWithRetries( + () -> + bigQueryOptions + .getBigQueryRpcV2() + .listTableDataWithRowLimit( + completeTableId.getProject(), + completeTableId.getDataset(), + completeTableId.getTable(), + connectionSettings.getMaxResultPerPage(), + pageToken), + bigQueryOptions.getRetrySettings(), + BigQueryBaseService.BIGQUERY_EXCEPTION_HANDLER, + bigQueryOptions.getClock()); + + return results; + } catch (RetryHelper.RetryHelperException e) { + throw BigQueryException.translateAndThrow(e); + } + } + + @VisibleForTesting + BigQueryResult highThroughPutRead( + TableId destinationTable, long totalRows, Schema schema, BigQueryResultStats stats) { + + try { + if (bqReadClient == null) { // if the read client isn't already initialized. Not thread safe. 
+ bqReadClient = BigQueryReadClient.create(); + } + String parent = String.format("projects/%s", destinationTable.getProject()); + String srcTable = + String.format( + "projects/%s/datasets/%s/tables/%s", + destinationTable.getProject(), + destinationTable.getDataset(), + destinationTable.getTable()); + + // Read all the columns if the source table (temp table) and stream the data back in Arrow + // format + ReadSession.Builder sessionBuilder = + ReadSession.newBuilder().setTable(srcTable).setDataFormat(DataFormat.ARROW); + + CreateReadSessionRequest.Builder builder = + CreateReadSessionRequest.newBuilder() + .setParent(parent) + .setReadSession(sessionBuilder) + .setMaxStreamCount(1) // Currently just one stream is allowed + // DO a regex check using order by and use multiple streams + ; + + ReadSession readSession = bqReadClient.createReadSession(builder.build()); + BlockingQueue buffer = new LinkedBlockingDeque<>(bufferSize); + Map arrowNameToIndex = new HashMap<>(); + // deserialize and populate the buffer async, so that the client isn't blocked + processArrowStreamAsync( + readSession, + buffer, + new ArrowRowReader(readSession.getArrowSchema(), arrowNameToIndex), + schema); + + logger.log(Level.INFO, "\n Using BigQuery Read API"); + return new BigQueryResultImpl(schema, totalRows, buffer, stats); + + } catch (IOException e) { + throw BigQueryException.translateAndThrow(e); + } + } + + private void processArrowStreamAsync( + ReadSession readSession, + BlockingQueue buffer, + ArrowRowReader reader, + Schema schema) { + + Runnable arrowStreamProcessor = + () -> { + try { + // Use the first stream to perform reading. + String streamName = readSession.getStreams(0).getName(); + ReadRowsRequest readRowsRequest = + ReadRowsRequest.newBuilder().setReadStream(streamName).build(); + + // Process each block of rows as they arrive and decode using our simple row reader. + com.google.api.gax.rpc.ServerStream stream = + bqReadClient.readRowsCallable().call(readRowsRequest); + for (ReadRowsResponse response : stream) { + if (Thread.currentThread().isInterrupted() + || queryTaskExecutor.isShutdown()) { // do not process and shutdown + break; + } + reader.processRows(response.getArrowRecordBatch(), buffer, schema); + } + + } catch (Exception e) { + throw BigQueryException.translateAndThrow(e); + } finally { + try { + buffer.put(new BigQueryResultImpl.Row(null, true)); // marking end of stream + queryTaskExecutor.shutdownNow(); // Shutdown the thread pool + } catch (InterruptedException e) { + logger.log(Level.WARNING, "\n Error occurred ", e); + } + } + }; + + queryTaskExecutor.execute(arrowStreamProcessor); + } + + private class ArrowRowReader + implements AutoCloseable { // TODO: Update to recent version of Arrow to avoid memoryleak + + BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE); + + // Decoder object will be reused to avoid re-allocation and too much garbage collection. 
+ private final VectorSchemaRoot root; + private final VectorLoader loader; + + private ArrowRowReader(ArrowSchema arrowSchema, Map arrowNameToIndex) + throws IOException { + org.apache.arrow.vector.types.pojo.Schema schema = + MessageSerializer.deserializeSchema( + new org.apache.arrow.vector.ipc.ReadChannel( + new ByteArrayReadableSeekableByteChannel( + arrowSchema.getSerializedSchema().toByteArray()))); + List vectors = new ArrayList<>(); + List fields = schema.getFields(); + for (int i = 0; i < fields.size(); i++) { + vectors.add(fields.get(i).createVector(allocator)); + arrowNameToIndex.put( + fields.get(i).getName(), + i); // mapping for getting against the field name in the result set + } + root = new VectorSchemaRoot(vectors); + loader = new VectorLoader(root); + } + + /** @param batch object returned from the ReadRowsResponse. */ + private void processRows( + ArrowRecordBatch batch, BlockingQueue buffer, Schema schema) + throws IOException { // deserialize the values and consume the hash of the values + try { + org.apache.arrow.vector.ipc.message.ArrowRecordBatch deserializedBatch = + MessageSerializer.deserializeRecordBatch( + new ReadChannel( + new ByteArrayReadableSeekableByteChannel( + batch.getSerializedRecordBatch().toByteArray())), + allocator); + + loader.load(deserializedBatch); + // Release buffers from batch (they are still held in the vectors in root). + deserializedBatch.close(); + + // Parse the vectors using BQ Schema. Deserialize the data at the row level and add it to + // the + // buffer + FieldList fields = schema.getFields(); + for (int rowNum = 0; + rowNum < root.getRowCount(); + rowNum++) { // for the given number of rows in the batch + + if (Thread.currentThread().isInterrupted() + || queryTaskExecutor.isShutdown()) { // do not process and shutdown + break; // exit the loop, root will be cleared in the finally block + } + + Map curRow = new HashMap<>(); + for (int col = 0; col < fields.size(); col++) { // iterate all the vectors for a given row + com.google.cloud.bigquery.Field field = fields.get(col); + FieldVector curFieldVec = + root.getVector( + field.getName()); // can be accessed using the index or Vector/column name + curRow.put(field.getName(), curFieldVec.getObject(rowNum)); // Added the raw value + } + buffer.put(new BigQueryResultImpl.Row(curRow)); + } + root.clear(); // TODO: make sure to clear the root while implementing the thread + // interruption logic (Connection.close method) + + } catch (RuntimeException | InterruptedException e) { + throw BigQueryException.translateAndThrow(e); + } finally { + try { + root.clear(); + } catch (RuntimeException e) { + logger.log(Level.WARNING, "\n Error while clearing VectorSchemaRoot ", e); + } + } + } + + @Override + public void close() { + root.close(); + allocator.close(); + } + } + /*Returns just the first page of GetQueryResultsResponse using the jobId*/ + @VisibleForTesting + GetQueryResultsResponse getQueryResultsFirstPage(JobId jobId) { + JobId completeJobId = + jobId + .setProjectId(bigQueryOptions.getProjectId()) + .setLocation( + jobId.getLocation() == null && bigQueryOptions.getLocation() != null + ? 
bigQueryOptions.getLocation() + : jobId.getLocation()); + try { + GetQueryResultsResponse results = + BigQueryRetryHelper.runWithRetries( + () -> + bigQueryRpc.getQueryResultsWithRowLimit( + completeJobId.getProject(), + completeJobId.getJob(), + completeJobId.getLocation(), + connectionSettings.getMaxResultPerPage()), + bigQueryOptions.getRetrySettings(), + BigQueryBaseService.BIGQUERY_EXCEPTION_HANDLER, + bigQueryOptions.getClock(), + retryConfig); + + if (results.getErrors() != null) { + List bigQueryErrors = + results.getErrors().stream() + .map(BigQueryError.FROM_PB_FUNCTION) + .collect(Collectors.toList()); + // Throwing BigQueryException since there may be no JobId and we want to stay consistent + // with the case where there there is a HTTP error + throw new BigQueryException(bigQueryErrors); + } + return results; + } catch (BigQueryRetryHelper.BigQueryRetryHelperException e) { + throw BigQueryException.translateAndThrow(e); + } + } + + @VisibleForTesting + boolean isFastQuerySupported() { + // TODO: add regex logic to check for scripting + return connectionSettings.getClustering() == null + && connectionSettings.getCreateDisposition() == null + && connectionSettings.getDestinationEncryptionConfiguration() == null + && connectionSettings.getDestinationTable() == null + && connectionSettings.getJobTimeoutMs() == null + && connectionSettings.getMaximumBillingTier() == null + && connectionSettings.getPriority() == null + && connectionSettings.getRangePartitioning() == null + && connectionSettings.getSchemaUpdateOptions() == null + && connectionSettings.getTableDefinitions() == null + && connectionSettings.getTimePartitioning() == null + && connectionSettings.getUserDefinedFunctions() == null + && connectionSettings.getWriteDisposition() == null; + } + + @VisibleForTesting + boolean useReadAPI(Long totalRows, Long pageRows, Schema schema, Boolean hasQueryParameters) { + + // TODO(prasmish) get this logic review - totalRows and pageRows are returned null when the job + // is not complete + if ((totalRows == null || pageRows == null) + && Boolean.TRUE.equals( + connectionSettings + .getUseReadAPI())) { // totalRows and pageRows are returned null when the job is not + // complete + return true; + } + + // Schema schema = Schema.fromPb(tableSchema); + // Read API does not yet support Interval Type or QueryParameters + if (containsIntervalType(schema) || hasQueryParameters) { + logger.log(Level.INFO, "\n Schema has IntervalType, or QueryParameters. Disabling ReadAPI"); + return false; + } + + long resultRatio = totalRows / pageRows; + if (Boolean.TRUE.equals(connectionSettings.getUseReadAPI())) { + return resultRatio >= connectionSettings.getTotalToPageRowCountRatio() + && totalRows > connectionSettings.getMinResultSize(); + } else { + return false; + } + } + + // Does a BFS iteration to find out if there's an interval type in the schema. 
Implementation to + // be used until ReadAPI supports IntervalType + private boolean containsIntervalType(Schema schema) { + Queue fields = + new LinkedList(schema.getFields()); + while (!fields.isEmpty()) { + com.google.cloud.bigquery.Field curField = fields.poll(); + if (curField.getType().getStandardType() == StandardSQLTypeName.INTERVAL) { + return true; + } else if (curField.getType().getStandardType() == StandardSQLTypeName.STRUCT + || curField.getType().getStandardType() == StandardSQLTypeName.ARRAY) { + fields.addAll(curField.getSubFields()); + } + } + return false; + } + + // Used for job.query API endpoint + @VisibleForTesting + QueryRequest createQueryRequest( + ConnectionSettings connectionSettings, + String sql, + List queryParameters, + Map labels) { + QueryRequest content = new QueryRequest(); + String requestId = UUID.randomUUID().toString(); + + if (connectionSettings.getConnectionProperties() != null) { + content.setConnectionProperties( + connectionSettings.getConnectionProperties().stream() + .map(ConnectionProperty.TO_PB_FUNCTION) + .collect(Collectors.toList())); + } + if (connectionSettings.getDefaultDataset() != null) { + content.setDefaultDataset(connectionSettings.getDefaultDataset().toPb()); + } + if (connectionSettings.getMaximumBytesBilled() != null) { + content.setMaximumBytesBilled(connectionSettings.getMaximumBytesBilled()); + } + if (connectionSettings.getMaxResults() != null) { + content.setMaxResults(connectionSettings.getMaxResults()); + } + if (queryParameters != null) { + // content.setQueryParameters(queryParameters); + if (queryParameters.get(0).getName() == null) { + // If query parameter name is unset, then assume mode is positional + content.setParameterMode("POSITIONAL"); + // pass query parameters + List queryParametersPb = + Lists.transform(queryParameters, POSITIONAL_PARAMETER_TO_PB_FUNCTION); + content.setQueryParameters(queryParametersPb); + } else { + content.setParameterMode("NAMED"); + // pass query parameters + List queryParametersPb = + Lists.transform(queryParameters, NAMED_PARAMETER_TO_PB_FUNCTION); + content.setQueryParameters(queryParametersPb); + } + } + if (connectionSettings.getCreateSession() != null) { + content.setCreateSession(connectionSettings.getCreateSession()); + } + if (labels != null) { + content.setLabels(labels); + } + content.setQuery(sql); + content.setRequestId(requestId); + // The new Connection interface only supports StandardSQL dialect + content.setUseLegacySql(false); + return content; + } + + // Used by jobs.getQueryResults API endpoint + @VisibleForTesting + com.google.api.services.bigquery.model.Job createQueryJob( + String sql, + ConnectionSettings connectionSettings, + List queryParameters, + Map labels) { + com.google.api.services.bigquery.model.JobConfiguration configurationPb = + new com.google.api.services.bigquery.model.JobConfiguration(); + JobConfigurationQuery queryConfigurationPb = new JobConfigurationQuery(); + queryConfigurationPb.setQuery(sql); + if (queryParameters != null) { + if (queryParameters.get(0).getName() == null) { + // If query parameter name is unset, then assume mode is positional + queryConfigurationPb.setParameterMode("POSITIONAL"); + // pass query parameters + List queryParametersPb = + Lists.transform(queryParameters, POSITIONAL_PARAMETER_TO_PB_FUNCTION); + queryConfigurationPb.setQueryParameters(queryParametersPb); + } else { + queryConfigurationPb.setParameterMode("NAMED"); + // pass query parameters + List queryParametersPb = + Lists.transform(queryParameters, 
+        queryConfigurationPb.setQueryParameters(queryParametersPb);
+      }
+    }
+    if (connectionSettings.getDestinationTable() != null) {
+      queryConfigurationPb.setDestinationTable(connectionSettings.getDestinationTable().toPb());
+    }
+    if (connectionSettings.getTableDefinitions() != null) {
+      queryConfigurationPb.setTableDefinitions(
+          Maps.transformValues(
+              connectionSettings.getTableDefinitions(),
+              ExternalTableDefinition.TO_EXTERNAL_DATA_FUNCTION));
+    }
+    if (connectionSettings.getUserDefinedFunctions() != null) {
+      queryConfigurationPb.setUserDefinedFunctionResources(
+          connectionSettings.getUserDefinedFunctions().stream()
+              .map(UserDefinedFunction.TO_PB_FUNCTION)
+              .collect(Collectors.toList()));
+    }
+    if (connectionSettings.getCreateDisposition() != null) {
+      queryConfigurationPb.setCreateDisposition(
+          connectionSettings.getCreateDisposition().toString());
+    }
+    if (connectionSettings.getWriteDisposition() != null) {
+      queryConfigurationPb.setWriteDisposition(connectionSettings.getWriteDisposition().toString());
+    }
+    if (connectionSettings.getDefaultDataset() != null) {
+      queryConfigurationPb.setDefaultDataset(connectionSettings.getDefaultDataset().toPb());
+    }
+    if (connectionSettings.getPriority() != null) {
+      queryConfigurationPb.setPriority(connectionSettings.getPriority().toString());
+    }
+    if (connectionSettings.getAllowLargeResults() != null) {
+      queryConfigurationPb.setAllowLargeResults(connectionSettings.getAllowLargeResults());
+    }
+    if (connectionSettings.getUseQueryCache() != null) {
+      queryConfigurationPb.setUseQueryCache(connectionSettings.getUseQueryCache());
+    }
+    if (connectionSettings.getFlattenResults() != null) {
+      queryConfigurationPb.setFlattenResults(connectionSettings.getFlattenResults());
+    }
+    if (connectionSettings.getMaximumBillingTier() != null) {
+      queryConfigurationPb.setMaximumBillingTier(connectionSettings.getMaximumBillingTier());
+    }
+    if (connectionSettings.getMaximumBytesBilled() != null) {
+      queryConfigurationPb.setMaximumBytesBilled(connectionSettings.getMaximumBytesBilled());
+    }
+    if (connectionSettings.getSchemaUpdateOptions() != null) {
+      ImmutableList.Builder<String> schemaUpdateOptionsBuilder = new ImmutableList.Builder<>();
+      for (JobInfo.SchemaUpdateOption schemaUpdateOption :
+          connectionSettings.getSchemaUpdateOptions()) {
+        schemaUpdateOptionsBuilder.add(schemaUpdateOption.name());
+      }
+      queryConfigurationPb.setSchemaUpdateOptions(schemaUpdateOptionsBuilder.build());
+    }
+    if (connectionSettings.getDestinationEncryptionConfiguration() != null) {
+      queryConfigurationPb.setDestinationEncryptionConfiguration(
+          connectionSettings.getDestinationEncryptionConfiguration().toPb());
+    }
+    if (connectionSettings.getTimePartitioning() != null) {
+      queryConfigurationPb.setTimePartitioning(connectionSettings.getTimePartitioning().toPb());
+    }
+    if (connectionSettings.getClustering() != null) {
+      queryConfigurationPb.setClustering(connectionSettings.getClustering().toPb());
+    }
+    if (connectionSettings.getRangePartitioning() != null) {
+      queryConfigurationPb.setRangePartitioning(connectionSettings.getRangePartitioning().toPb());
+    }
+    if (connectionSettings.getConnectionProperties() != null) {
+      queryConfigurationPb.setConnectionProperties(
+          connectionSettings.getConnectionProperties().stream()
+              .map(ConnectionProperty.TO_PB_FUNCTION)
+              .collect(Collectors.toList()));
+    }
+    if (connectionSettings.getCreateSession() != null) {
+      queryConfigurationPb.setCreateSession(connectionSettings.getCreateSession());
+    }
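+    // The remaining settings apply to the job configuration as a whole rather than to the
+    // query configuration: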
+    if (connectionSettings.getJobTimeoutMs() != null) {
+      configurationPb.setJobTimeoutMs(connectionSettings.getJobTimeoutMs());
+    }
+    if (labels != null) {
+      configurationPb.setLabels(labels);
+    }
+    // The new Connection interface only supports the StandardSQL dialect
+    queryConfigurationPb.setUseLegacySql(false);
+    configurationPb.setQuery(queryConfigurationPb);
+
+    com.google.api.services.bigquery.model.Job jobPb =
+        JobInfo.of(QueryJobConfiguration.fromPb(configurationPb)).toPb();
+    com.google.api.services.bigquery.model.Job queryJob;
+    try {
+      queryJob =
+          BigQueryRetryHelper.runWithRetries(
+              () -> bigQueryRpc.createJobForQuery(jobPb),
+              bigQueryOptions.getRetrySettings(),
+              BigQueryBaseService.BIGQUERY_EXCEPTION_HANDLER,
+              bigQueryOptions.getClock(),
+              retryConfig);
+    } catch (BigQueryRetryHelper.BigQueryRetryHelperException e) {
+      throw BigQueryException.translateAndThrow(e);
+    }
+    return queryJob;
+  }
+
+  // Used by dryRun
+  private com.google.api.services.bigquery.model.Job createDryRunJob(String sql) {
+    com.google.api.services.bigquery.model.JobConfiguration configurationPb =
+        new com.google.api.services.bigquery.model.JobConfiguration();
+    configurationPb.setDryRun(true);
+    JobConfigurationQuery queryConfigurationPb = new JobConfigurationQuery();
+    // Heuristic: "?" placeholders imply positional parameters, otherwise named
+    String parameterMode = sql.contains("?") ? "POSITIONAL" : "NAMED";
+    queryConfigurationPb.setParameterMode(parameterMode);
+    queryConfigurationPb.setQuery(sql);
+    // UndeclaredQueryParameters are only supported in StandardSQL
+    queryConfigurationPb.setUseLegacySql(false);
+    if (connectionSettings.getDefaultDataset() != null) {
+      queryConfigurationPb.setDefaultDataset(connectionSettings.getDefaultDataset().toPb());
+    }
+    if (connectionSettings.getCreateSession() != null) {
+      queryConfigurationPb.setCreateSession(connectionSettings.getCreateSession());
+    }
+    configurationPb.setQuery(queryConfigurationPb);
+
+    com.google.api.services.bigquery.model.Job jobPb =
+        JobInfo.of(QueryJobConfiguration.fromPb(configurationPb)).toPb();
+
+    com.google.api.services.bigquery.model.Job dryRunJob;
+    try {
+      dryRunJob =
+          BigQueryRetryHelper.runWithRetries(
+              () -> bigQueryRpc.createJobForQuery(jobPb),
+              bigQueryOptions.getRetrySettings(),
+              BigQueryBaseService.BIGQUERY_EXCEPTION_HANDLER,
+              bigQueryOptions.getClock(),
+              retryConfig);
+    } catch (BigQueryRetryHelper.BigQueryRetryHelperException e) {
+      throw BigQueryException.translateAndThrow(e);
+    }
+    return dryRunJob;
+  }
+
+  // Converts from the Parameter wrapper class to the positional QueryParameter generated class
+  private static final Function<Parameter, QueryParameter> POSITIONAL_PARAMETER_TO_PB_FUNCTION =
+      value -> {
+        QueryParameter queryParameterPb = new QueryParameter();
+        queryParameterPb.setParameterValue(value.getValue().toValuePb());
+        queryParameterPb.setParameterType(value.getValue().toTypePb());
+        return queryParameterPb;
+      };
+
+  // Converts from the Parameter wrapper class to the named QueryParameter generated class
+  private static final Function<Parameter, QueryParameter> NAMED_PARAMETER_TO_PB_FUNCTION =
+      value -> {
+        QueryParameter queryParameterPb = new QueryParameter();
+        queryParameterPb.setName(value.getName());
+        queryParameterPb.setParameterValue(value.getValue().toValuePb());
+        queryParameterPb.setParameterType(value.getValue().toTypePb());
+        return queryParameterPb;
+      };
+
+  // Converts from the QueryParameter generated class to the Parameter wrapper class
+  private static final Function<QueryParameter, Parameter> QUERY_PARAMETER_FROM_PB_FUNCTION =
+      pb ->
+          Parameter.newBuilder()
+              .setName(pb.getName() == null ? "" : pb.getName())
+              .setValue(QueryParameterValue.fromPb(pb.getParameterValue(), pb.getParameterType()))
+              .build();
+}
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionSettings.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionSettings.java
new file mode 100644
index 000000000..ac3b1b1e0
--- /dev/null
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionSettings.java
@@ -0,0 +1,453 @@
+/*
+ * Copyright 2021 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package com.google.cloud.bigquery;
+
+import com.google.auto.value.AutoValue;
+import com.google.cloud.bigquery.JobInfo.CreateDisposition;
+import com.google.cloud.bigquery.JobInfo.SchemaUpdateOption;
+import com.google.cloud.bigquery.JobInfo.WriteDisposition;
+import com.google.cloud.bigquery.QueryJobConfiguration.Priority;
+import java.util.List;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+/** ConnectionSettings for setting up a BigQuery query connection. */
+@AutoValue
+public abstract class ConnectionSettings {
+  ConnectionSettings() {
+    // Package private so users can't subclass it but AutoValue can.
+  }
+
+  /**
+   * Returns the useReadAPI flag, enabled by default. The Read API will be used if the underlying
+   * conditions are satisfied and this flag is enabled.
+   */
+  @Nullable
+  public abstract Boolean getUseReadAPI();
+
+  /** Returns the synchronous response timeout, in milliseconds, associated with this query */
+  @Nullable
+  public abstract Long getRequestTimeout();
+
+  /** Returns the connection properties for the connection string with this query */
+  @Nullable
+  public abstract List<ConnectionProperty> getConnectionProperties();
+
+  /** Returns the default dataset */
+  @Nullable
+  public abstract DatasetId getDefaultDataset();
+
+  /** Returns the limit on the bytes billed for this job */
+  @Nullable
+  public abstract Long getMaximumBytesBilled();
+
+  /** Returns the maximum number of rows of data */
+  @Nullable
+  public abstract Long getMaxResults();
+
+  /** Returns the number of rows of data to pre-fetch */
+  @Nullable
+  public abstract Integer getNumBufferedRows();
+
+  @Nullable
+  public abstract Integer getTotalToPageRowCountRatio();
+
+  @Nullable
+  public abstract Integer getMinResultSize();
+
+  @Nullable
+  public abstract Integer getMaxResultPerPage();
+
+  /** Returns whether to look for the result in the query cache */
+  @Nullable
+  public abstract Boolean getUseQueryCache();
+
+  /**
+   * Returns whether nested and repeated fields should be flattened. If set to {@code false} {@link
+   * ConnectionSettings.Builder#setAllowLargeResults(Boolean)} must be {@code true}.
+   *
+   * @see Flatten
+   */
+  @Nullable
+  public abstract Boolean getFlattenResults();
+
+  /*
+   * Returns the BigQuery Storage read API configuration. Not yet exposed:
+   *
+   * @Nullable
+   * public abstract ReadClientConnectionConfiguration getReadClientConnectionConfiguration();
+   */
+
+  /*
+   * The properties below are only supported by the jobs.insert API and are not yet supported by
+   * the jobs.query API.
+   */
+
+  /** Returns the clustering specification for the destination table. */
+  @Nullable
+  public abstract Clustering getClustering();
+
+  /**
+   * Returns whether the job is allowed to create new tables.
+   *
+   * @see Create Disposition
+   */
+  @Nullable
+  public abstract CreateDisposition getCreateDisposition();
+
+  /** Returns the custom encryption configuration (e.g., Cloud KMS keys) */
+  @Nullable
+  public abstract EncryptionConfiguration getDestinationEncryptionConfiguration();
+
+  /**
+   * Returns the table where the query results will be written. If not provided, a new table is
+   * created. This value is required if {@link #getAllowLargeResults()} is {@code true}.
+   */
+  @Nullable
+  public abstract TableId getDestinationTable();
+
+  /** Returns the timeout associated with this job */
+  @Nullable
+  public abstract Long getJobTimeoutMs();
+
+  /** Returns the optional billing tier limit for this job. */
+  @Nullable
+  public abstract Integer getMaximumBillingTier();
+
+  /** Returns the query priority. */
+  @Nullable
+  public abstract Priority getPriority();
+
+  /**
+   * Returns whether the job is enabled to create arbitrarily large results. If {@code true} the
+   * query is allowed to create large results at a slight cost in performance.
+   *
+   * @see Returning Large Query Results
+   */
+  @Nullable
+  public abstract Boolean getAllowLargeResults();
+
+  /**
+   * Returns whether to create a new session.
+   *
+   * @see Create Sessions
+   */
+  @Nullable
+  public abstract Boolean getCreateSession();
+
+  /** Returns the range partitioning specification for the table */
+  @Nullable
+  public abstract RangePartitioning getRangePartitioning();
+
+  /**
+   * [Experimental] Returns options allowing the schema of the destination table to be updated as a
+   * side effect of the query job. Schema update options are supported in two cases: when
+   * writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination
+   * table is a partition of a table, specified by partition decorators. For normal tables,
+   * WRITE_TRUNCATE will always overwrite the schema.
+   */
+  @Nullable
+  public abstract List<SchemaUpdateOption> getSchemaUpdateOptions();
+
+  /**
+   * Returns the external tables definitions. If querying external data sources outside of
+   * BigQuery, this value describes the data format, location, and other properties of the data
+   * sources. By defining these properties, the data sources can be queried as if they were
+   * standard BigQuery tables.
+   */
+  @Nullable
+  public abstract Map<String, ExternalTableDefinition> getTableDefinitions();
+
+  /** Returns the time partitioning specification for the destination table. */
+  @Nullable
+  public abstract TimePartitioning getTimePartitioning();
+
+  /**
+   * Returns user defined function resources that can be used by this query. Function resources can
+   * either be defined inline ({@link UserDefinedFunction.Type#INLINE}) or loaded from a Google
+   * Cloud Storage URI ({@link UserDefinedFunction.Type#FROM_URI}).
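+   *
+   * <p>For example (a sketch; the inline body and URI are illustrative):
+   *
+   * <pre>{@code
+   * List<UserDefinedFunction> udfs =
+   *     ImmutableList.of(
+   *         UserDefinedFunction.inline("function magic() { return 1; }"),
+   *         UserDefinedFunction.fromUri("gs://my-bucket/my-udf.js"));
+   * }</pre>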
+   */
+  @Nullable
+  public abstract List<UserDefinedFunction> getUserDefinedFunctions();
+
+  /**
+   * Returns the action that should occur if the destination table already exists.
+   *
+   * @see Write Disposition
+   */
+  @Nullable
+  public abstract WriteDisposition getWriteDisposition();
+
+  /** Returns a builder pre-populated using the current values of this object. */
+  public abstract Builder toBuilder();
+
+  /** Returns a builder for a {@code ConnectionSettings} object. */
+  public static Builder newBuilder() {
+    return new AutoValue_ConnectionSettings.Builder().withDefaultValues();
+  }
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+
+    Builder withDefaultValues() {
+      return setUseReadAPI(true) // the Read API is enabled by default
+          .setNumBufferedRows(10000) // 10K records will be kept in the buffer (a blocking queue)
+          .setMinResultSize(200000) // the Read API is used only for at least 200K records
+          .setTotalToPageRowCountRatio(3) // there should be at least 3 pages of records
+          .setMaxResultPerPage(100000); // page size for pagination
+    }
+
+    /**
+     * Sets the useReadAPI flag, enabled by default. The Read API will be used if the underlying
+     * conditions are satisfied and this flag is enabled.
+     *
+     * @param useReadAPI useReadAPI or {@code null} for none
+     */
+    @Nullable
+    public abstract Builder setUseReadAPI(Boolean useReadAPI);
+
+    /**
+     * Sets how long to wait for the query to complete, in milliseconds, before the request times
+     * out and returns. Note that this is only a timeout for the request, not the query. If the
+     * query takes longer to run than the timeout value, the call returns without any results and
+     * with the 'jobComplete' flag set to false. You can call GetQueryResults() to wait for the
+     * query to complete and read the results. The default value is 10000 milliseconds (10
+     * seconds).
+     *
+     * @param timeoutMs timeoutMs or {@code null} for none
+     */
+    public abstract Builder setRequestTimeout(Long timeoutMs);
+
+    /**
+     * Sets a connection-level property to customize query behavior.
+     *
+     * @param connectionProperties connectionProperties or {@code null} for none
+     */
+    public abstract Builder setConnectionProperties(List<ConnectionProperty> connectionProperties);
+
+    /**
+     * Sets the default dataset. This dataset is used for all unqualified table names used in the
+     * query.
+     */
+    public abstract Builder setDefaultDataset(DatasetId datasetId);
+
+    /**
+     * Limits the bytes billed for this job. Queries that will have bytes billed beyond this limit
+     * will fail (without incurring a charge). If unspecified, this will be set to your project
+     * default.
+     *
+     * @param maximumBytesBilled maximum bytes billed for this job
+     */
+    public abstract Builder setMaximumBytesBilled(Long maximumBytesBilled);
+
+    /**
+     * Sets the maximum number of rows of data to return per page of results. Setting this flag to
+     * a small value such as 1000 and then paging through results might improve reliability when
+     * the query result set is large. In addition to this limit, responses are also limited to 10
+     * MB. By default, there is no maximum row count, and only the byte limit applies.
+     *
+     * @param maxResults maxResults or {@code null} for none
+     */
+    public abstract Builder setMaxResults(Long maxResults);
+
+    /**
+     * Sets the number of rows in the buffer (a blocking queue) that query results are consumed
+     * from.
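+     *
+     * <p>Sketch (the dataset name and value are illustrative):
+     *
+     * <pre>{@code
+     * ConnectionSettings settings =
+     *     ConnectionSettings.newBuilder()
+     *         .setDefaultDataset(DatasetId.of("my_dataset"))
+     *         .setNumBufferedRows(10000)
+     *         .build();
+     * }</pre>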
+     *
+     * @param numBufferedRows numBufferedRows or {@code null} for none
+     */
+    public abstract Builder setNumBufferedRows(Integer numBufferedRows);
+
+    /**
+     * Sets the ratio of the total number of records to the records returned in the current page.
+     * This value is checked before calling the Read API.
+     *
+     * @param totalToPageRowCountRatio totalToPageRowCountRatio
+     */
+    public abstract Builder setTotalToPageRowCountRatio(Integer totalToPageRowCountRatio);
+
+    /**
+     * Sets the minimum result size for which the Read API will be enabled.
+     *
+     * @param minResultSize minResultSize
+     */
+    public abstract Builder setMinResultSize(Integer minResultSize);
+
+    /**
+     * Sets the maximum records per page to be used for pagination. This is used as an input for
+     * the tabledata.list and jobs.getQueryResults RPC calls.
+     *
+     * @param maxResultPerPage maxResultPerPage
+     */
+    public abstract Builder setMaxResultPerPage(Integer maxResultPerPage);
+
+    /**
+     * Sets whether to look for the result in the query cache. The query cache is a best-effort
+     * cache that will be flushed whenever tables in the query are modified. Moreover, the query
+     * cache is only available when {@link ConnectionSettings.Builder#setDestinationTable(TableId)}
+     * is not set.
+     *
+     * @see Query Caching
+     */
+    public abstract Builder setUseQueryCache(Boolean useQueryCache);
+
+    /**
+     * Sets whether nested and repeated fields should be flattened. If set to {@code false} {@link
+     * ConnectionSettings.Builder#setAllowLargeResults(Boolean)} must be {@code true}. By default
+     * results are flattened.
+     *
+     * @see Flatten
+     */
+    public abstract Builder setFlattenResults(Boolean flattenResults);
+
+    /*
+     * Sets the values necessary to determine whether the table result will be read using the
+     * BigQuery Storage client Read API. The BigQuery Storage client Read API will be used to read
+     * the query result when the totalToPageRowCountRatio (default 3) and minResultSize (default
+     * 200000 rows) conditions are met. A ReadSession will be created using the Apache Arrow data
+     * format for serialization.
+     *

+     * <p>It also sets the maximum number of table rows allowed in the buffer before streaming
+     * them to the BigQueryResult.
+     *
+     * @param readClientConnectionConfiguration or {@code null} for none
+     */
+    /*
+    public abstract Builder setReadClientConnectionConfiguration(
+        ReadClientConnectionConfiguration readClientConnectionConfiguration);
+    */
+
+    /** Sets the clustering specification for the destination table. */
+    public abstract Builder setClustering(Clustering clustering);
+
+    /**
+     * Sets whether the job is allowed to create tables.
+     *
+     * @see Create Disposition
+     */
+    public abstract Builder setCreateDisposition(CreateDisposition createDisposition);
+
+    /**
+     * Sets the custom encryption configuration (e.g., Cloud KMS keys).
+     *
+     * @param destinationEncryptionConfiguration destinationEncryptionConfiguration or {@code null}
+     *     for none
+     */
+    public abstract Builder setDestinationEncryptionConfiguration(
+        EncryptionConfiguration destinationEncryptionConfiguration);
+
+    /**
+     * Sets the table where the query results will be written. If not provided, a new table is
+     * created. This value is required if {@link
+     * ConnectionSettings.Builder#setAllowLargeResults(Boolean)} is set to {@code true}.
+     */
+    public abstract Builder setDestinationTable(TableId destinationTable);
+
+    /**
+     * [Optional] Job timeout in milliseconds. If this time limit is exceeded, BigQuery may attempt
+     * to terminate the job.
+     *
+     * @param jobTimeoutMs jobTimeoutMs or {@code null} for none
+     */
+    public abstract Builder setJobTimeoutMs(Long jobTimeoutMs);
+
+    /**
+     * Limits the billing tier for this job. Queries that have resource usage beyond this tier will
+     * fail (without incurring a charge). If unspecified, this will be set to your project default.
+     *
+     * @param maximumBillingTier maximum billing tier for this job
+     */
+    public abstract Builder setMaximumBillingTier(Integer maximumBillingTier);
+
+    /**
+     * Sets a priority for the query. If not specified the priority is assumed to be {@link
+     * Priority#INTERACTIVE}.
+     */
+    public abstract Builder setPriority(Priority priority);
+
+    /**
+     * Sets whether the job is enabled to create arbitrarily large results. If {@code true} the
+     * query is allowed to create large results at a slight cost in performance. If {@code true}
+     * {@link ConnectionSettings.Builder#setDestinationTable(TableId)} must be provided.
+     *
+     * @see Returning Large Query Results
+     */
+    public abstract Builder setAllowLargeResults(Boolean allowLargeResults);
+
+    /**
+     * Sets whether to create a new session. If {@code true}, a random session id will be generated
+     * by BigQuery. If {@code false}, the query runs with an existing session_id passed in
+     * ConnectionProperty; otherwise the query runs in non-session mode.
+     */
+    public abstract Builder setCreateSession(Boolean createSession);
+
+    /**
+     * Range partitioning specification for this table. Only one of timePartitioning and
+     * rangePartitioning should be specified.
+     *
+     * @param rangePartitioning rangePartitioning or {@code null} for none
+     */
+    public abstract Builder setRangePartitioning(RangePartitioning rangePartitioning);
+
+    /**
+     * [Experimental] Sets options allowing the schema of the destination table to be updated as a
+     * side effect of the query job. Schema update options are supported in two cases: when
+     * writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination
+     * table is a partition of a table, specified by partition decorators. For normal tables,
+     * WRITE_TRUNCATE will always overwrite the schema.
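+     *
+     * <p>For example (a sketch; the chosen options are illustrative):
+     *
+     * <pre>{@code
+     * ConnectionSettings.newBuilder()
+     *     .setWriteDisposition(JobInfo.WriteDisposition.WRITE_APPEND)
+     *     .setSchemaUpdateOptions(
+     *         ImmutableList.of(JobInfo.SchemaUpdateOption.ALLOW_FIELD_RELAXATION))
+     *     .build();
+     * }</pre>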
+     */
+    public abstract Builder setSchemaUpdateOptions(List<SchemaUpdateOption> schemaUpdateOptions);
+
+    /**
+     * Sets the external tables definitions. If querying external data sources outside of BigQuery,
+     * this value describes the data format, location, and other properties of the data sources. By
+     * defining these properties, the data sources can be queried as if they were standard BigQuery
+     * tables.
+     */
+    public abstract Builder setTableDefinitions(
+        Map<String, ExternalTableDefinition> tableDefinitions);
+
+    /** Sets the time partitioning specification for the destination table. */
+    public abstract Builder setTimePartitioning(TimePartitioning timePartitioning);
+
+    /**
+     * Sets user defined function resources that can be used by this query. Function resources can
+     * either be defined inline ({@link UserDefinedFunction#inline(String)}) or loaded from a
+     * Google Cloud Storage URI ({@link UserDefinedFunction#fromUri(String)}).
+     */
+    public abstract Builder setUserDefinedFunctions(
+        List<UserDefinedFunction> userDefinedFunctions);
+
+    /**
+     * Sets the action that should occur if the destination table already exists.
+     *
+     * @see Write Disposition
+     */
+    public abstract Builder setWriteDisposition(WriteDisposition writeDisposition);
+
+    /** Creates a {@code ConnectionSettings} object. */
+    public abstract ConnectionSettings build();
+  }
+}
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobStatistics.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobStatistics.java
index ab9fdabb3..0ef1d1f94 100644
--- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobStatistics.java
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobStatistics.java
@@ -21,6 +21,7 @@
 import com.google.api.services.bigquery.model.JobStatistics2;
 import com.google.api.services.bigquery.model.JobStatistics3;
 import com.google.api.services.bigquery.model.JobStatistics4;
+import com.google.api.services.bigquery.model.QueryParameter;
 import com.google.cloud.StringEnumType;
 import com.google.cloud.StringEnumValue;
 import com.google.common.base.Function;
@@ -339,6 +340,7 @@ public static class QueryStatistics extends JobStatistics {
     private final List<QueryStage> queryPlan;
     private final List<TimelineSample> timeline;
     private final Schema schema;
+    private final List<QueryParameter> queryParameters;
 
     /**
      * StatementType represents possible types of SQL statements reported as part of the
@@ -421,6 +423,7 @@ static final class Builder extends JobStatistics.Builder {
       private List<QueryStage> queryPlan;
       private List<TimelineSample> timeline;
       private Schema schema;
+      private List<QueryParameter> queryParameters;
 
       private Builder() {}
@@ -569,6 +572,11 @@ Builder setSchema(Schema schema) {
         return self();
       }
 
+      Builder setQueryParameters(List<QueryParameter> queryParameters) {
+        this.queryParameters = queryParameters;
+        return self();
+      }
+
       @Override
       QueryStatistics build() {
         return new QueryStatistics(this);
@@ -595,6 +603,7 @@ private QueryStatistics(Builder builder) {
       this.queryPlan = builder.queryPlan;
       this.timeline = builder.timeline;
       this.schema = builder.schema;
+      this.queryParameters = builder.queryParameters;
     }
 
     /** Returns query statistics specific to the use of BI Engine. */
@@ -715,6 +724,14 @@ public Schema getSchema() {
       return schema;
     }
 
+    /**
+     * Standard SQL only: Returns a list of undeclared query parameters detected during a dry run
+     * validation.
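+     *
+     * <p>A usage sketch (the {@code job} variable is illustrative):
+     *
+     * <pre>{@code
+     * JobStatistics.QueryStatistics stats = job.getStatistics();
+     * List<QueryParameter> undeclared = stats.getQueryParameters(); // populated by dry runs
+     * }</pre>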
+     */
+    public List<QueryParameter> getQueryParameters() {
+      return queryParameters;
+    }
+
     @Override
     ToStringHelper toStringHelper() {
       return super.toStringHelper()
@@ -725,7 +742,8 @@ ToStringHelper toStringHelper() {
           .add("totalBytesProcessed", totalBytesProcessed)
           .add("queryPlan", queryPlan)
           .add("timeline", timeline)
-          .add("schema", schema);
+          .add("schema", schema)
+          .add("queryParameters", queryParameters);
     }
 
     @Override
@@ -746,7 +764,8 @@ public final int hashCode() {
           totalBytesBilled,
           totalBytesProcessed,
           queryPlan,
-          schema);
+          schema,
+          queryParameters);
     }
 
     @Override
@@ -788,6 +807,9 @@ com.google.api.services.bigquery.model.JobStatistics toPb() {
       if (schema != null) {
        queryStatisticsPb.setSchema(schema.toPb());
       }
+      if (queryParameters != null) {
+        queryStatisticsPb.setUndeclaredQueryParameters(queryParameters);
+      }
       return super.toPb().setQuery(queryStatisticsPb);
     }
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Parameter.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Parameter.java
new file mode 100644
index 000000000..9959feab9
--- /dev/null
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Parameter.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2022 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package com.google.cloud.bigquery;
+
+import com.google.auto.value.AutoValue;
+import javax.annotation.Nullable;
+
+/** Wrapper class for query parameters. */
+@AutoValue
+public abstract class Parameter {
+  Parameter() {
+    // Package private so users can't subclass it but AutoValue can.
+  }
+
+  /**
+   * Returns the name of the query parameter. If unset, this is a positional parameter. Otherwise,
+   * it should be unique within a query.
+   *
+   * @return value or {@code null} for none
+   */
+  @Nullable
+  public abstract String getName();
+
+  /** Returns the value for a query parameter along with its type. */
+  public abstract QueryParameterValue getValue();
+
+  /** Returns a builder pre-populated using the current values of this object. */
+  public abstract Builder toBuilder();
+
+  /** Returns a builder for a {@code Parameter} object. */
+  public static Builder newBuilder() {
+    return new AutoValue_Parameter.Builder();
+  }
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+
+    /**
+     * [Optional] Sets the name of the query parameter. If unset, this is a positional parameter.
+     * Otherwise, it should be unique within a query.
+     *
+     * @param name name or {@code null} for none
+     */
+    public abstract Builder setName(String name);
+
+    /**
+     * Sets the value for a query parameter along with its type.
+     *
+     * @param parameter parameter or {@code null} for none
+     */
+    public abstract Builder setValue(QueryParameterValue parameter);
+
+    /** Creates a {@code Parameter} object.
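+     *
+     * <p>For example (a sketch; the name and value are illustrative):
+     *
+     * <pre>{@code
+     * Parameter country =
+     *     Parameter.newBuilder()
+     *         .setName("country")
+     *         .setValue(QueryParameterValue.string("US"))
+     *         .build();
+     * }</pre>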
+     */
+    public abstract Parameter build();
+  }
+}
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/QueryJobConfiguration.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/QueryJobConfiguration.java
index 48ec22caf..cc726bdd1 100644
--- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/QueryJobConfiguration.java
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/QueryJobConfiguration.java
@@ -166,6 +166,11 @@ private Builder(com.google.api.services.bigquery.model.JobConfiguration configur
       this();
       JobConfigurationQuery queryConfigurationPb = configurationPb.getQuery();
       this.query = queryConfigurationPb.getQuery();
+      // Allows undeclared query parameters to be surfaced in JobStatistics2
+      if (queryConfigurationPb.getQueryParameters() == null
+          && queryConfigurationPb.getParameterMode() != null) {
+        parameterMode = queryConfigurationPb.getParameterMode();
+      }
       if (queryConfigurationPb.getQueryParameters() != null
           && !queryConfigurationPb.getQueryParameters().isEmpty()) {
         if (queryConfigurationPb.getQueryParameters().get(0).getName() == null) {
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ReadClientConnectionConfiguration.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ReadClientConnectionConfiguration.java
new file mode 100644
index 000000000..e0805a11e
--- /dev/null
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ReadClientConnectionConfiguration.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2021 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package com.google.cloud.bigquery;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import javax.annotation.Nullable;
+
+/** Represents BigQueryStorage Read client connection information. */
+@AutoValue
+public abstract class ReadClientConnectionConfiguration implements Serializable {
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+
+    /**
+     * Sets the total row count to page row count ratio used to determine whether to use the
+     * BigQueryStorage Read client to fetch result sets after the first page.
+     */
+    @Nullable
+    public abstract Builder setTotalToPageRowCountRatio(Long ratio);
+
+    /**
+     * Sets the minimum number of table rows in the query results used to determine whether to use
+     * the BigQueryStorage Read client to fetch result sets after the first page.
+     */
+    @Nullable
+    public abstract Builder setMinResultSize(Long numRows);
+
+    /**
+     * Sets the maximum number of table rows allowed in the buffer before streaming them to the
+     * BigQueryResult.
+     */
+    @Nullable
+    public abstract Builder setBufferSize(Long bufferSize);
+
+    /** Creates a {@code ReadClientConnectionConfiguration} object. */
+    public abstract ReadClientConnectionConfiguration build();
+  }
+
+  /** Returns the totalToPageRowCountRatio in this configuration. */
+  public abstract Long getTotalToPageRowCountRatio();
+
+  /** Returns the minResultSize in this configuration.
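+   *
+   * <p>Configuration sketch (the values are illustrative):
+   *
+   * <pre>{@code
+   * ReadClientConnectionConfiguration config =
+   *     ReadClientConnectionConfiguration.newBuilder()
+   *         .setTotalToPageRowCountRatio(3L)
+   *         .setMinResultSize(200000L)
+   *         .setBufferSize(100000L)
+   *         .build();
+   * }</pre>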
+   */
+  public abstract Long getMinResultSize();
+
+  /** Returns the bufferSize in this configuration. */
+  public abstract Long getBufferSize();
+
+  public abstract Builder toBuilder();
+
+  /** Returns a builder for a {@code ReadClientConnectionConfiguration} object. */
+  public static Builder newBuilder() {
+    return new AutoValue_ReadClientConnectionConfiguration.Builder();
+  }
+}
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/BigQueryRpc.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/BigQueryRpc.java
index 06488c5b4..871590ca4 100644
--- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/BigQueryRpc.java
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/BigQueryRpc.java
@@ -122,6 +122,13 @@ Boolean getBoolean(Map<Option, ?> options) {
    */
   Job create(Job job, Map<Option, ?> options);
 
+  /**
+   * Creates a new query job.
+   *
+   * @throws BigQueryException upon failure
+   */
+  Job createJobForQuery(Job job);
+
   /**
    * Delete the requested dataset.
    *
@@ -246,6 +253,14 @@ TableDataInsertAllResponse insertAll(
   TableDataList listTableData(
       String projectId, String datasetId, String tableId, Map<Option, ?> options);
 
+  /**
+   * Lists the table's rows with a limit on how many rows of data to pre-fetch.
+   *
+   * @throws BigQueryException upon failure
+   */
+  TableDataList listTableDataWithRowLimit(
+      String projectId, String datasetId, String tableId, Integer rowLimit, String pageToken);
+
   /**
    * Returns the requested job or {@code null} if not found.
    *
@@ -253,6 +268,13 @@ TableDataList listTableData(
   Job getJob(String projectId, String jobId, String location, Map<Option, ?> options);
 
+  /**
+   * Returns the requested query job or {@code null} if not found.
+   *
+   * @throws BigQueryException upon failure
+   */
+  Job getQueryJob(String projectId, String jobId, String location);
+
   /**
    * Lists the project's jobs.
    *
@@ -286,6 +308,15 @@ TableDataList listTableData(
   GetQueryResultsResponse getQueryResults(
       String projectId, String jobId, String location, Map<Option, ?> options);
 
+  /**
+   * Returns the results of the query associated with the provided job, with a limit on how many
+   * rows of data to pre-fetch.
+   *
+   * @throws BigQueryException upon failure
+   */
+  GetQueryResultsResponse getQueryResultsWithRowLimit(
+      String projectId, String jobId, String location, Integer preFetchedRowLimit);
+
   /**
    * Runs a BigQuery SQL query synchronously and returns query results if the query completes
    * within a specified timeout.
diff --git a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/HttpBigQueryRpc.java b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/HttpBigQueryRpc.java
index 24d7dd6b0..d6b57a3da 100644
--- a/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/HttpBigQueryRpc.java
+++ b/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/HttpBigQueryRpc.java
@@ -221,6 +221,19 @@ public Job create(Job job, Map<Option, ?> options) {
     }
   }
 
+  @Override
+  public Job createJobForQuery(Job job) {
+    try {
+      String projectId =
+          job.getJobReference() != null
+              ?
+                  job.getJobReference().getProjectId()
+              : this.options.getProjectId();
+      return bigquery.jobs().insert(projectId, job).setPrettyPrint(false).execute();
+    } catch (IOException ex) {
+      throw translate(ex);
+    }
+  }
+
   @Override
   public boolean deleteDataset(String projectId, String datasetId, Map<Option, ?> options) {
     try {
@@ -515,6 +528,26 @@ public TableDataList listTableData(
     }
   }
 
+  @Override
+  public TableDataList listTableDataWithRowLimit(
+      String projectId,
+      String datasetId,
+      String tableId,
+      Integer maxResultPerPage,
+      String pageToken) {
+    try {
+      return bigquery
+          .tabledata()
+          .list(projectId, datasetId, tableId)
+          .setPrettyPrint(false)
+          .setMaxResults(Long.valueOf(maxResultPerPage))
+          .setPageToken(pageToken)
+          .execute();
+    } catch (IOException ex) {
+      throw translate(ex);
+    }
+  }
+
   @Override
   public Job getJob(String projectId, String jobId, String location, Map<Option, ?> options) {
     try {
@@ -534,6 +567,24 @@ public Job getJob(String projectId, String jobId, String location, Map<Option, ?> options) {
     }
   }
 
+  @Override
+  public Job getQueryJob(String projectId, String jobId, String location) {
+    try {
+      return bigquery
+          .jobs()
+          .get(projectId, jobId)
+          .setPrettyPrint(false)
+          .setLocation(location)
+          .execute();
+    } catch (IOException ex) {
+      throw translate(ex);
+    }
+  }
+
   public Tuple<String, Iterable<Job>> listJobs(String projectId, Map<Option, ?> options) {
     try {
@@ -644,6 +695,22 @@ public GetQueryResultsResponse getQueryResults(
     }
   }
 
+  @Override
+  public GetQueryResultsResponse getQueryResultsWithRowLimit(
+      String projectId, String jobId, String location, Integer maxResultPerPage) {
+    try {
+      return bigquery
+          .jobs()
+          .getQueryResults(projectId, jobId)
+          .setPrettyPrint(false)
+          .setLocation(location)
+          .setMaxResults(Long.valueOf(maxResultPerPage))
+          .execute();
+    } catch (IOException ex) {
+      throw translate(ex);
+    }
+  }
+
   @Override
   public QueryResponse queryRpc(String projectId, QueryRequest content) {
     try {
diff --git a/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/ConnectionImplTest.java b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/ConnectionImplTest.java
new file mode 100644
index 000000000..e4fdc9731
--- /dev/null
+++ b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/ConnectionImplTest.java
@@ -0,0 +1,542 @@
+/*
+ * Copyright 2021 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */ + +package com.google.cloud.bigquery; + +import static org.junit.Assert.*; +import static org.junit.Assert.assertEquals; +import static org.mockito.ArgumentMatchers.any; +import static org.mockito.Mockito.*; +import static org.mockito.Mockito.verify; + +import com.google.api.services.bigquery.model.*; +import com.google.api.services.bigquery.model.QueryResponse; +import com.google.cloud.ServiceOptions; +import com.google.cloud.Tuple; +import com.google.cloud.bigquery.spi.BigQueryRpcFactory; +import com.google.cloud.bigquery.spi.v2.BigQueryRpc; +import com.google.common.collect.ImmutableList; +import java.math.BigInteger; +import java.sql.SQLException; +import java.util.AbstractList; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.BlockingQueue; +import java.util.concurrent.LinkedBlockingDeque; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mockito; +import org.mockito.junit.MockitoJUnitRunner; + +@RunWith(MockitoJUnitRunner.class) +public class ConnectionImplTest { + private BigQueryOptions options; + private BigQueryRpcFactory rpcFactoryMock; + private BigQueryRpc bigqueryRpcMock; + private Connection connectionMock; + private BigQuery bigquery; + private ConnectionImpl connection; + private static final String PROJECT = "project"; + private static final String JOB = "job"; + private static final String LOCATION = "US"; + private static final String DEFAULT_TEST_DATASET = "bigquery_test_dataset"; + private static final String PAGE_TOKEN = "ABCD123"; + private static final TableId TABLE_NAME = TableId.of(DEFAULT_TEST_DATASET, PROJECT); + private static final TableCell STRING_CELL = new TableCell().setV("Value"); + private static final TableRow TABLE_ROW = new TableRow().setF(ImmutableList.of(STRING_CELL)); + private static final String SQL_QUERY = + "SELECT county, state_name FROM bigquery_test_dataset.large_data_testing_table limit 2"; + private static final String DRY_RUN_SQL = + "SELECT county, state_name FROM bigquery_test_dataset.large_data_testing_table where country = ?"; + private static final int DEFAULT_PAGE_SIZE = 10000; + private ConnectionSettings connectionSettings; + private static final Schema QUERY_SCHEMA = + Schema.of( + Field.newBuilder("country", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("state_name", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .build()); + private static final TableSchema FAST_QUERY_TABLESCHEMA = QUERY_SCHEMA.toPb(); + private static final BigQueryResult BQ_RS_MOCK_RES = + new BigQueryResultImpl(QUERY_SCHEMA, 2, null, null); + + private static final BigQueryResult BQ_RS_MOCK_RES_MULTI_PAGE = + new BigQueryResultImpl(QUERY_SCHEMA, 4, null, null); + + private static final JobId QUERY_JOB = JobId.of(PROJECT, JOB).setLocation(LOCATION); + private static final GetQueryResultsResponse GET_QUERY_RESULTS_RESPONSE = + new GetQueryResultsResponse() + .setJobReference(QUERY_JOB.toPb()) + .setRows(ImmutableList.of(TABLE_ROW)) + .setJobComplete(true) + .setCacheHit(false) + .setPageToken(PAGE_TOKEN) + .setTotalBytesProcessed(42L) + .setTotalRows(BigInteger.valueOf(1L)) + .setSchema(FAST_QUERY_TABLESCHEMA); + + private BigQueryOptions createBigQueryOptionsForProject( + String project, BigQueryRpcFactory rpcFactory) { + return BigQueryOptions.newBuilder() + .setProjectId(project) + .setServiceRpcFactory(rpcFactory) + 
.setRetrySettings(ServiceOptions.getNoRetrySettings())
+        .build();
+  }
+
+  @Before
+  public void setUp() {
+    rpcFactoryMock = mock(BigQueryRpcFactory.class);
+    bigqueryRpcMock = mock(BigQueryRpc.class);
+    connectionMock = mock(Connection.class);
+    when(rpcFactoryMock.create(any(BigQueryOptions.class))).thenReturn(bigqueryRpcMock);
+    options = createBigQueryOptionsForProject(PROJECT, rpcFactoryMock);
+    bigquery = options.getService();
+
+    connectionSettings =
+        ConnectionSettings.newBuilder()
+            .setDefaultDataset(DatasetId.of(DEFAULT_TEST_DATASET))
+            .setNumBufferedRows(DEFAULT_PAGE_SIZE)
+            .build();
+    bigquery =
+        options
+            .toBuilder()
+            .setRetrySettings(ServiceOptions.getDefaultRetrySettings())
+            .build()
+            .getService();
+    connection = (ConnectionImpl) bigquery.createConnection(connectionSettings);
+    assertNotNull(connection);
+  }
+
+  @Test
+  public void testFastQuerySinglePage() throws BigQuerySQLException {
+    com.google.api.services.bigquery.model.QueryResponse mockQueryRes =
+        new QueryResponse().setSchema(FAST_QUERY_TABLESCHEMA).setJobComplete(true);
+    when(bigqueryRpcMock.queryRpc(any(String.class), any(QueryRequest.class)))
+        .thenReturn(mockQueryRes);
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    doReturn(BQ_RS_MOCK_RES)
+        .when(connectionSpy)
+        .processQueryResponseResults(any(QueryResponse.class));
+
+    BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY);
+    assertEquals(res.getTotalRows(), 2);
+    assertEquals(QUERY_SCHEMA, res.getSchema());
+    verify(connectionSpy, times(1))
+        .processQueryResponseResults(
+            any(com.google.api.services.bigquery.model.QueryResponse.class));
+  }
+
+  @Test
+  // NOTE: This doesn't truly paginate. It returns a single response while mocking
+  // processQueryResponseResults
+  public void testFastQueryMultiplePages() throws BigQuerySQLException {
+    com.google.api.services.bigquery.model.QueryResponse mockQueryRes =
+        new QueryResponse()
+            .setSchema(FAST_QUERY_TABLESCHEMA)
+            .setJobComplete(true)
+            .setPageToken(PAGE_TOKEN);
+    when(bigqueryRpcMock.queryRpc(any(String.class), any(QueryRequest.class)))
+        .thenReturn(mockQueryRes);
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+
+    doReturn(BQ_RS_MOCK_RES_MULTI_PAGE)
+        .when(connectionSpy)
+        .processQueryResponseResults(
+            any(com.google.api.services.bigquery.model.QueryResponse.class));
+
+    BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY);
+    assertEquals(res.getTotalRows(), 4);
+    assertEquals(QUERY_SCHEMA, res.getSchema());
+    verify(connectionSpy, times(1))
+        .processQueryResponseResults(
+            any(com.google.api.services.bigquery.model.QueryResponse.class));
+  }
+
+  @Test
+  public void testClose() throws BigQuerySQLException {
+    boolean cancelled = connection.close();
+    assertTrue(cancelled);
+  }
+
+  @Test
+  public void testQueryDryRun() throws BigQuerySQLException {
+    List<QueryParameter> queryParametersMock =
+        ImmutableList.of(
+            new QueryParameter().setParameterType(new QueryParameterType().setType("STRING")));
+    com.google.api.services.bigquery.model.JobStatistics2 queryMock =
+        new com.google.api.services.bigquery.model.JobStatistics2()
+            .setSchema(FAST_QUERY_TABLESCHEMA)
+            .setUndeclaredQueryParameters(queryParametersMock);
+    com.google.api.services.bigquery.model.JobStatistics jobStatsMock =
+        new com.google.api.services.bigquery.model.JobStatistics()
+            .setCreationTime(1234L)
+            .setStartTime(5678L)
+            .setQuery(queryMock);
+    com.google.api.services.bigquery.model.JobConfigurationQuery jobConfigurationQuery =
+        new com.google.api.services.bigquery.model.JobConfigurationQuery();
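+    // Wire the mocked statistics and configuration into a single dry-run job response: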
+    com.google.api.services.bigquery.model.JobConfiguration jobConfig =
+        new com.google.api.services.bigquery.model.JobConfiguration()
+            .setQuery(jobConfigurationQuery);
+    com.google.api.services.bigquery.model.Job mockDryRunJob =
+        new com.google.api.services.bigquery.model.Job()
+            .setStatistics(jobStatsMock)
+            .setConfiguration(jobConfig);
+    when(bigqueryRpcMock.createJobForQuery(any(com.google.api.services.bigquery.model.Job.class)))
+        .thenReturn(mockDryRunJob);
+    BigQueryDryRunResult dryRunResult = connection.dryRun(DRY_RUN_SQL);
+    assertEquals(1, dryRunResult.getQueryParameters().size());
+    assertEquals(QUERY_SCHEMA, dryRunResult.getSchema());
+    verify(bigqueryRpcMock, times(1))
+        .createJobForQuery(any(com.google.api.services.bigquery.model.Job.class));
+  }
+
+  @Test
+  public void testParseDataTask() throws InterruptedException {
+    List<TableRow> tableRows =
+        ImmutableList.of(
+            new TableRow()
+                .setF(
+                    ImmutableList.of(
+                        new TableCell().setV("Value1"), new TableCell().setV("Value2"))),
+            new TableRow()
+                .setF(
+                    ImmutableList.of(
+                        new TableCell().setV("Value3"), new TableCell().setV("Value4"))));
+
+    BlockingQueue<Tuple<Iterable<FieldValueList>, Boolean>> pageCache =
+        new LinkedBlockingDeque<>(2);
+    BlockingQueue<Tuple<TableDataList, Boolean>> rpcResponseQueue = new LinkedBlockingDeque<>(2);
+    rpcResponseQueue.offer(Tuple.of(null, false));
+    // This call should populate the page cache
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    connectionSpy.parseRpcDataAsync(tableRows, QUERY_SCHEMA, pageCache, rpcResponseQueue);
+    Tuple<Iterable<FieldValueList>, Boolean> fvlTuple =
+        pageCache.take(); // wait for the parser thread to parse the data
+    assertNotNull(fvlTuple);
+    Iterable<FieldValueList> iterableFvl = fvlTuple.x();
+    int rowCnt = 0;
+    for (FieldValueList fvl : iterableFvl) {
+      assertEquals(2, fvl.size()); // both rows should have 2 fields each
+      rowCnt++;
+    }
+    assertEquals(2, rowCnt); // two rows read
+
+    verify(connectionSpy, times(1))
+        .parseRpcDataAsync(
+            any(List.class), any(Schema.class), any(BlockingQueue.class), any(BlockingQueue.class));
+  }
+
+  @Test
+  public void testPopulateBuffer() throws InterruptedException {
+    List<TableRow> tableRows =
+        ImmutableList.of(
+            new TableRow()
+                .setF(
+                    ImmutableList.of(
+                        new TableCell().setV("Value1"), new TableCell().setV("Value2"))),
+            new TableRow()
+                .setF(
+                    ImmutableList.of(
+                        new TableCell().setV("Value3"), new TableCell().setV("Value4"))));
+
+    BlockingQueue<Tuple<Iterable<FieldValueList>, Boolean>> pageCache =
+        new LinkedBlockingDeque<>(2);
+    BlockingQueue<Tuple<TableDataList, Boolean>> rpcResponseQueue = new LinkedBlockingDeque<>(2);
+    BlockingQueue<AbstractList<FieldValue>> buffer = new LinkedBlockingDeque<>(5);
+    rpcResponseQueue.offer(Tuple.of(null, false));
+    // This call should populate the page cache
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+
+    connectionSpy.parseRpcDataAsync(tableRows, QUERY_SCHEMA, pageCache, rpcResponseQueue);
+
+    verify(connectionSpy, times(1))
+        .parseRpcDataAsync(
+            any(List.class), any(Schema.class), any(BlockingQueue.class), any(BlockingQueue.class));
+
+    // now pass the pageCache to the populateBuffer method
+    connectionSpy.populateBufferAsync(rpcResponseQueue, pageCache, buffer);
+    // check that the buffer was populated with two rows asynchronously, using the blocking take()
+    AbstractList<FieldValue> fvl1 = buffer.take();
+    assertNotNull(fvl1);
+    assertEquals(2, fvl1.size());
+    assertEquals("Value1", fvl1.get(0).getValue().toString());
+    assertEquals("Value2", fvl1.get(1).getValue().toString());
+    AbstractList<FieldValue> fvl2 = buffer.take();
+    assertNotNull(fvl2);
+    assertEquals(2, fvl2.size());
+    assertEquals("Value3", fvl2.get(0).getValue().toString());
+    assertEquals("Value4", fvl2.get(1).getValue().toString());
+    verify(connectionSpy, times(1))
+        .populateBufferAsync(
+            any(BlockingQueue.class), any(BlockingQueue.class), any(BlockingQueue.class));
+  }
+
+  @Test
+  public void testNextPageTask() throws InterruptedException {
+    BlockingQueue<Tuple<TableDataList, Boolean>> rpcResponseQueue = new LinkedBlockingDeque<>(2);
+    TableDataList mockTabledataList =
+        new TableDataList()
+            .setPageToken(PAGE_TOKEN)
+            .setRows(ImmutableList.of(TABLE_ROW))
+            .setTotalRows(1L);
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    doReturn(mockTabledataList)
+        .when(connectionSpy)
+        .tableDataListRpc(any(TableId.class), any(String.class));
+    connectionSpy.runNextPageTaskAsync(PAGE_TOKEN, TABLE_NAME, rpcResponseQueue);
+    Tuple<TableDataList, Boolean> tableDataListTuple = rpcResponseQueue.take();
+    assertNotNull(tableDataListTuple);
+    TableDataList tableDataList = tableDataListTuple.x();
+    assertNotNull(tableDataList);
+    assertEquals("ABCD123", tableDataList.getPageToken());
+    assertEquals(Long.valueOf(1), tableDataList.getTotalRows());
+    verify(connectionSpy, times(1))
+        .runNextPageTaskAsync(any(String.class), any(TableId.class), any(BlockingQueue.class));
+  }
+
+  @Test
+  public void testGetQueryResultsFirstPage() {
+    when(bigqueryRpcMock.getQueryResultsWithRowLimit(
+            any(String.class), any(String.class), any(String.class), any(Integer.class)))
+        .thenReturn(GET_QUERY_RESULTS_RESPONSE);
+    GetQueryResultsResponse response = connection.getQueryResultsFirstPage(QUERY_JOB);
+    assertNotNull(response);
+    assertEquals(GET_QUERY_RESULTS_RESPONSE, response);
+    verify(bigqueryRpcMock, times(1))
+        .getQueryResultsWithRowLimit(
+            any(String.class), any(String.class), any(String.class), any(Integer.class));
+  }
+
+  // calls executeSelect with a non-fast query and exercises createQueryJob
+  @Test
+  public void testLegacyQuerySinglePage() throws BigQuerySQLException {
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    com.google.api.services.bigquery.model.Job jobResponseMock =
+        new com.google.api.services.bigquery.model.Job()
+            .setJobReference(QUERY_JOB.toPb())
+            .setId(JOB)
+            .setStatus(new com.google.api.services.bigquery.model.JobStatus().setState("DONE"));
+    // emulating a legacy query
+    doReturn(false).when(connectionSpy).isFastQuerySupported();
+    doReturn(GET_QUERY_RESULTS_RESPONSE)
+        .when(connectionSpy)
+        .getQueryResultsFirstPage(any(JobId.class));
+    doReturn(BQ_RS_MOCK_RES)
+        .when(connectionSpy)
+        .getSubsequentQueryResultsWithJob(
+            any(Long.class),
+            any(Long.class),
+            any(JobId.class),
+            any(GetQueryResultsResponse.class),
+            any(Boolean.class));
+    when(bigqueryRpcMock.createJobForQuery(any(com.google.api.services.bigquery.model.Job.class)))
+        .thenReturn(jobResponseMock); // RPC call in createQueryJob
+    BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY);
+    assertEquals(res.getTotalRows(), 2);
+    assertEquals(QUERY_SCHEMA, res.getSchema());
+    verify(bigqueryRpcMock, times(1))
+        .createJobForQuery(any(com.google.api.services.bigquery.model.Job.class));
+  }
+
+  // exercises getSubsequentQueryResultsWithJob for fast running queries
+  @Test
+  public void testFastQueryLongRunning() throws SQLException {
+    List<TableRow> tableRows =
+        ImmutableList.of(
+            new TableRow()
+                .setF(
+                    ImmutableList.of(
+                        new TableCell().setV("Value1"), new TableCell().setV("Value2"))),
+            new TableRow()
+                .setF(
+                    ImmutableList.of(
+                        new TableCell().setV("Value3"), new TableCell().setV("Value4"))));
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    // emulating a fast query
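+    // The spy forces the fast-query branch below; RPC responses are mocked, so no network
+    // calls are made.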
+    doReturn(true).when(connectionSpy).isFastQuerySupported();
+    doReturn(GET_QUERY_RESULTS_RESPONSE)
+        .when(connectionSpy)
+        .getQueryResultsFirstPage(any(JobId.class));
+
+    doReturn(TABLE_NAME).when(connectionSpy).getDestinationTable(any(JobId.class));
+    doReturn(BQ_RS_MOCK_RES)
+        .when(connectionSpy)
+        .tableDataList(any(GetQueryResultsResponse.class), any(JobId.class));
+
+    com.google.api.services.bigquery.model.QueryResponse mockQueryRes =
+        new QueryResponse()
+            .setSchema(FAST_QUERY_TABLESCHEMA)
+            .setJobComplete(false)
+            .setTotalRows(new BigInteger(String.valueOf(4L)))
+            .setJobReference(QUERY_JOB.toPb())
+            .setRows(tableRows);
+    when(bigqueryRpcMock.queryRpc(any(String.class), any(QueryRequest.class)))
+        .thenReturn(mockQueryRes);
+    BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY);
+    assertEquals(res.getTotalRows(), 2);
+    assertEquals(QUERY_SCHEMA, res.getSchema());
+    verify(bigqueryRpcMock, times(1)).queryRpc(any(String.class), any(QueryRequest.class));
+  }
+
+  @Test
+  // Emulates the first page response using getQueryResultsFirstPage(jobId) and subsequent pages
+  // using getSubsequentQueryResultsWithJob(...)
+  public void testLegacyQueryMultiplePages() throws SQLException {
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    com.google.api.services.bigquery.model.JobStatistics jobStatistics =
+        new com.google.api.services.bigquery.model.JobStatistics();
+    // emulating a legacy query
+    doReturn(false).when(connectionSpy).isFastQuerySupported();
+    doReturn(GET_QUERY_RESULTS_RESPONSE)
+        .when(connectionSpy)
+        .getQueryResultsFirstPage(any(JobId.class));
+    doReturn(TABLE_NAME).when(connectionSpy).getDestinationTable(any(JobId.class));
+    doReturn(BQ_RS_MOCK_RES)
+        .when(connectionSpy)
+        .tableDataList(any(GetQueryResultsResponse.class), any(JobId.class));
+    com.google.api.services.bigquery.model.Job jobResponseMock =
+        new com.google.api.services.bigquery.model.Job()
+            .setJobReference(QUERY_JOB.toPb())
+            .setId(JOB)
+            .setStatus(new com.google.api.services.bigquery.model.JobStatus().setState("DONE"))
+            .setStatistics(jobStatistics);
+    when(bigqueryRpcMock.createJobForQuery(any(com.google.api.services.bigquery.model.Job.class)))
+        .thenReturn(jobResponseMock); // RPC call in createQueryJob
+    BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY);
+    assertEquals(res.getTotalRows(), 2);
+    assertEquals(QUERY_SCHEMA, res.getSchema());
+    verify(bigqueryRpcMock, times(1))
+        .createJobForQuery(any(com.google.api.services.bigquery.model.Job.class));
+    verify(connectionSpy, times(1))
+        .tableDataList(any(GetQueryResultsResponse.class), any(JobId.class));
+  }
+
+  @Test
+  public void testExecuteSelectSlow() throws BigQuerySQLException {
+    ConnectionImpl connectionSpy = Mockito.spy(connection);
+    doReturn(false).when(connectionSpy).isFastQuerySupported();
+    com.google.api.services.bigquery.model.JobStatistics jobStatistics =
+        new com.google.api.services.bigquery.model.JobStatistics();
+    com.google.api.services.bigquery.model.Job jobResponseMock =
+        new com.google.api.services.bigquery.model.Job()
+            .setJobReference(QUERY_JOB.toPb())
+            .setId(JOB)
+            .setStatus(new com.google.api.services.bigquery.model.JobStatus().setState("DONE"))
+            .setStatistics(jobStatistics);
+
+    doReturn(jobResponseMock)
+        .when(connectionSpy)
+        .createQueryJob(SQL_QUERY, connectionSettings, null, null);
+    doReturn(GET_QUERY_RESULTS_RESPONSE)
+        .when(connectionSpy)
+        .getQueryResultsFirstPage(any(JobId.class));
+    doReturn(BQ_RS_MOCK_RES)
+        .when(connectionSpy)
.getResultSet( + any(GetQueryResultsResponse.class), + any(JobId.class), + any(String.class), + any(Boolean.class)); + BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY); + assertEquals(res.getTotalRows(), 2); + assertEquals(QUERY_SCHEMA, res.getSchema()); + verify(connectionSpy, times(1)) + .getResultSet( + any(GetQueryResultsResponse.class), + any(JobId.class), + any(String.class), + any(Boolean.class)); + } + + @Test + public void testExecuteSelectSlowWithParams() throws BigQuerySQLException { + ConnectionImpl connectionSpy = Mockito.spy(connection); + List parameters = new ArrayList<>(); + Map labels = new HashMap<>(); + doReturn(false).when(connectionSpy).isFastQuerySupported(); + com.google.api.services.bigquery.model.JobStatistics jobStatistics = + new com.google.api.services.bigquery.model.JobStatistics(); + com.google.api.services.bigquery.model.Job jobResponseMock = + new com.google.api.services.bigquery.model.Job() + .setJobReference(QUERY_JOB.toPb()) + .setId(JOB) + .setStatus(new com.google.api.services.bigquery.model.JobStatus().setState("DONE")) + .setStatistics(jobStatistics); + + doReturn(jobResponseMock) + .when(connectionSpy) + .createQueryJob(SQL_QUERY, connectionSettings, parameters, labels); + doReturn(GET_QUERY_RESULTS_RESPONSE) + .when(connectionSpy) + .getQueryResultsFirstPage(any(JobId.class)); + doReturn(BQ_RS_MOCK_RES) + .when(connectionSpy) + .getResultSet( + any(GetQueryResultsResponse.class), + any(JobId.class), + any(String.class), + any(Boolean.class)); + BigQueryResult res = connectionSpy.executeSelect(SQL_QUERY, parameters, labels); + assertEquals(res.getTotalRows(), 2); + assertEquals(QUERY_SCHEMA, res.getSchema()); + verify(connectionSpy, times(1)) + .getResultSet( + any(GetQueryResultsResponse.class), + any(JobId.class), + any(String.class), + any(Boolean.class)); + } + + @Test + public void testGetSubsequentQueryResultsWithJob() { + ConnectionImpl connectionSpy = Mockito.spy(connection); + JobId jobId = mock(JobId.class); + BigQueryResultStats bqRsStats = mock(BigQueryResultStats.class); + doReturn(true) + .when(connectionSpy) + .useReadAPI(any(Long.class), any(Long.class), any(Schema.class), any(Boolean.class)); + doReturn(BQ_RS_MOCK_RES) + .when(connectionSpy) + .highThroughPutRead( + any(TableId.class), any(Long.class), any(Schema.class), any(BigQueryResultStats.class)); + + doReturn(TABLE_NAME).when(connectionSpy).getDestinationTable(any(JobId.class)); + doReturn(bqRsStats).when(connectionSpy).getBigQueryResultSetStats(any(JobId.class)); + BigQueryResult res = + connectionSpy.getSubsequentQueryResultsWithJob( + 10000L, 100L, jobId, GET_QUERY_RESULTS_RESPONSE, false); + assertEquals(res.getTotalRows(), 2); + assertEquals(QUERY_SCHEMA, res.getSchema()); + verify(connectionSpy, times(1)) + .getSubsequentQueryResultsWithJob(10000L, 100L, jobId, GET_QUERY_RESULTS_RESPONSE, false); + } + + @Test + public void testGetPageCacheSize() { + ConnectionImpl connectionSpy = Mockito.spy(connection); + // number of cached pages should be within a range + assertTrue(connectionSpy.getPageCacheSize(10000, QUERY_SCHEMA) >= 3); + assertTrue(connectionSpy.getPageCacheSize(100000000, QUERY_SCHEMA) <= 20); + verify(connectionSpy, times(2)).getPageCacheSize(any(Integer.class), any(Schema.class)); + } +} diff --git a/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/ConnectionSettingsTest.java b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/ConnectionSettingsTest.java new file mode 100644 index 000000000..8523825bc --- /dev/null +++ 
b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/ConnectionSettingsTest.java @@ -0,0 +1,166 @@ +/* + * Copyright 2021 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.google.cloud.bigquery; + +import static org.junit.Assert.assertEquals; + +import com.google.cloud.bigquery.JobInfo.CreateDisposition; +import com.google.cloud.bigquery.JobInfo.SchemaUpdateOption; +import com.google.cloud.bigquery.JobInfo.WriteDisposition; +import com.google.cloud.bigquery.QueryJobConfiguration.Priority; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import java.util.List; +import java.util.Map; +import org.junit.Test; + +public class ConnectionSettingsTest { + private static final String TEST_PROJECT_ID = "test-project-id"; + private static final DatasetId DATASET_ID = DatasetId.of("dataset"); + private static final TableId TABLE_ID = TableId.of("dataset", "table"); + private static final Long REQUEST_TIMEOUT = 10L; + private static final Integer NUM_BUFFERED_ROWS = 100; + private static final Long MAX_RESULTS = 1000L; + private static final List<String> SOURCE_URIS = ImmutableList.of("uri1", "uri2"); + private static final String KEY = "time_zone"; + private static final String VALUE = "US/Eastern"; + private static final ConnectionProperty CONNECTION_PROPERTY = + ConnectionProperty.newBuilder().setKey(KEY).setValue(VALUE).build(); + private static final List<ConnectionProperty> CONNECTION_PROPERTIES = + ImmutableList.of(CONNECTION_PROPERTY); + private static final Field FIELD_SCHEMA1 = + Field.newBuilder("StringField", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .setDescription("FieldDescription1") + .build(); + private static final Field FIELD_SCHEMA2 = + Field.newBuilder("IntegerField", StandardSQLTypeName.INT64) + .setMode(Field.Mode.REPEATED) + .setDescription("FieldDescription2") + .build(); + private static final Schema TABLE_SCHEMA = Schema.of(FIELD_SCHEMA1, FIELD_SCHEMA2); + private static final Integer MAX_BAD_RECORDS = 42; + private static final Boolean IGNORE_UNKNOWN_VALUES = true; + private static final String COMPRESSION = "GZIP"; + private static final CsvOptions CSV_OPTIONS = CsvOptions.newBuilder().build(); + private static final ExternalTableDefinition TABLE_CONFIGURATION = + ExternalTableDefinition.newBuilder(SOURCE_URIS, TABLE_SCHEMA, CSV_OPTIONS) + .setCompression(COMPRESSION) + .setIgnoreUnknownValues(IGNORE_UNKNOWN_VALUES) + .setMaxBadRecords(MAX_BAD_RECORDS) + .build(); + private static final Map<String, ExternalTableDefinition> TABLE_DEFINITIONS = + ImmutableMap.of("tableName", TABLE_CONFIGURATION); + private static final CreateDisposition CREATE_DISPOSITION = CreateDisposition.CREATE_IF_NEEDED; + private static final WriteDisposition WRITE_DISPOSITION = WriteDisposition.WRITE_APPEND; + private static final Priority PRIORITY = Priority.BATCH; + private static final boolean ALLOW_LARGE_RESULTS = true; + private static final boolean USE_QUERY_CACHE = false; + private static final boolean FLATTEN_RESULTS = true; + private
static final Integer MAX_BILLING_TIER = 123; + private static final Long MAX_BYTES_BILL = 12345L; + private static final List<SchemaUpdateOption> SCHEMA_UPDATE_OPTIONS = + ImmutableList.of(SchemaUpdateOption.ALLOW_FIELD_RELAXATION); + private static final List<UserDefinedFunction> USER_DEFINED_FUNCTIONS = + ImmutableList.of(UserDefinedFunction.inline("Function"), UserDefinedFunction.fromUri("URI")); + private static final EncryptionConfiguration JOB_ENCRYPTION_CONFIGURATION = + EncryptionConfiguration.newBuilder().setKmsKeyName("KMS_KEY_1").build(); + private static final TimePartitioning TIME_PARTITIONING = + TimePartitioning.of(TimePartitioning.Type.DAY); + private static final Clustering CLUSTERING = + Clustering.newBuilder().setFields(ImmutableList.of("Foo", "Bar")).build(); + private static final Long TIMEOUT = 10L; + private static final RangePartitioning.Range RANGE = + RangePartitioning.Range.newBuilder().setStart(1L).setInterval(2L).setEnd(10L).build(); + private static final RangePartitioning RANGE_PARTITIONING = + RangePartitioning.newBuilder().setField("IntegerField").setRange(RANGE).build(); + + private static final ConnectionSettings CONNECTION_SETTINGS = + ConnectionSettings.newBuilder() + .setRequestTimeout(REQUEST_TIMEOUT) + .setNumBufferedRows(NUM_BUFFERED_ROWS) + .setMaxResults(MAX_RESULTS) + .setUseQueryCache(USE_QUERY_CACHE) + .setTableDefinitions(TABLE_DEFINITIONS) + .setAllowLargeResults(ALLOW_LARGE_RESULTS) + .setCreateDisposition(CREATE_DISPOSITION) + .setDefaultDataset(DATASET_ID) + .setDestinationTable(TABLE_ID) + .setWriteDisposition(WRITE_DISPOSITION) + .setPriority(PRIORITY) + .setFlattenResults(FLATTEN_RESULTS) + .setUserDefinedFunctions(USER_DEFINED_FUNCTIONS) + .setMaximumBillingTier(MAX_BILLING_TIER) + .setMaximumBytesBilled(MAX_BYTES_BILL) + .setSchemaUpdateOptions(SCHEMA_UPDATE_OPTIONS) + .setDestinationEncryptionConfiguration(JOB_ENCRYPTION_CONFIGURATION) + .setTimePartitioning(TIME_PARTITIONING) + .setClustering(CLUSTERING) + .setJobTimeoutMs(TIMEOUT) + .setRangePartitioning(RANGE_PARTITIONING) + .setConnectionProperties(CONNECTION_PROPERTIES) + .build(); + + @Test + public void testToBuilder() { + compareConnectionSettings(CONNECTION_SETTINGS, CONNECTION_SETTINGS.toBuilder().build()); + } + + @Test + public void testToBuilderIncomplete() { + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DATASET_ID).build(); + compareConnectionSettings(connectionSettings, connectionSettings.toBuilder().build()); + } + + @Test + public void testBuilder() { + assertEquals(REQUEST_TIMEOUT, CONNECTION_SETTINGS.getRequestTimeout()); + assertEquals(NUM_BUFFERED_ROWS, CONNECTION_SETTINGS.getNumBufferedRows()); + assertEquals(MAX_RESULTS, CONNECTION_SETTINGS.getMaxResults()); + } + + private void compareConnectionSettings(ConnectionSettings expected, ConnectionSettings value) { + assertEquals(expected, value); + assertEquals(expected.hashCode(), value.hashCode()); + assertEquals(expected.toString(), value.toString()); + assertEquals(expected.getRequestTimeout(), value.getRequestTimeout()); + assertEquals(expected.getNumBufferedRows(), value.getNumBufferedRows()); + assertEquals(expected.getMaxResults(), value.getMaxResults()); + assertEquals(expected.getAllowLargeResults(), value.getAllowLargeResults()); + assertEquals(expected.getCreateDisposition(), value.getCreateDisposition()); + assertEquals(expected.getDefaultDataset(), value.getDefaultDataset()); + assertEquals(expected.getDestinationTable(), value.getDestinationTable()); +
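
The compareConnectionSettings helper being assembled here verifies the builder round trip: every value set on a ConnectionSettings must survive toBuilder().build(), along with equals, hashCode, and toString. A hedged sketch of that invariant in isolation (the dataset name is a placeholder):

```
import com.google.cloud.bigquery.ConnectionSettings;
import com.google.cloud.bigquery.DatasetId;

public class RoundTripSketch {
  public static void main(String[] args) {
    ConnectionSettings original =
        ConnectionSettings.newBuilder()
            .setDefaultDataset(DatasetId.of("my_dataset")) // placeholder dataset name
            .setRequestTimeout(10L)
            .build();
    // toBuilder() must copy every field, so the rebuilt object compares equal.
    ConnectionSettings copy = original.toBuilder().build();
    System.out.println(original.equals(copy)); // expected: true
    System.out.println(original.hashCode() == copy.hashCode()); // expected: true
  }
}
```
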
assertEquals(expected.getFlattenResults(), value.getFlattenResults()); + assertEquals(expected.getPriority(), value.getPriority()); + assertEquals(expected.getTableDefinitions(), value.getTableDefinitions()); + assertEquals(expected.getUseQueryCache(), value.getUseQueryCache()); + assertEquals(expected.getUserDefinedFunctions(), value.getUserDefinedFunctions()); + assertEquals(expected.getWriteDisposition(), value.getWriteDisposition()); + assertEquals(expected.getMaximumBillingTier(), value.getMaximumBillingTier()); + assertEquals(expected.getMaximumBytesBilled(), value.getMaximumBytesBilled()); + assertEquals(expected.getSchemaUpdateOptions(), value.getSchemaUpdateOptions()); + assertEquals( + expected.getDestinationEncryptionConfiguration(), + value.getDestinationEncryptionConfiguration()); + assertEquals(expected.getTimePartitioning(), value.getTimePartitioning()); + assertEquals(expected.getClustering(), value.getClustering()); + assertEquals(expected.getJobTimeoutMs(), value.getJobTimeoutMs()); + assertEquals(expected.getRangePartitioning(), value.getRangePartitioning()); + assertEquals(expected.getConnectionProperties(), value.getConnectionProperties()); + } +} diff --git a/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java index f3b60cebe..348749b46 100644 --- a/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java +++ b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java @@ -49,10 +49,15 @@ import com.google.cloud.bigquery.BigQuery.JobOption; import com.google.cloud.bigquery.BigQuery.TableField; import com.google.cloud.bigquery.BigQuery.TableOption; +import com.google.cloud.bigquery.BigQueryDryRunResult; import com.google.cloud.bigquery.BigQueryError; import com.google.cloud.bigquery.BigQueryException; +import com.google.cloud.bigquery.BigQueryResult; +import com.google.cloud.bigquery.BigQuerySQLException; import com.google.cloud.bigquery.Clustering; +import com.google.cloud.bigquery.Connection; import com.google.cloud.bigquery.ConnectionProperty; +import com.google.cloud.bigquery.ConnectionSettings; import com.google.cloud.bigquery.CopyJobConfiguration; import com.google.cloud.bigquery.Dataset; import com.google.cloud.bigquery.DatasetId; @@ -62,6 +67,7 @@ import com.google.cloud.bigquery.Field; import com.google.cloud.bigquery.FieldList; import com.google.cloud.bigquery.FieldValue; +import com.google.cloud.bigquery.FieldValue.Attribute; import com.google.cloud.bigquery.FieldValueList; import com.google.cloud.bigquery.FormatOptions; import com.google.cloud.bigquery.HivePartitioningOptions; @@ -72,6 +78,9 @@ import com.google.cloud.bigquery.JobInfo; import com.google.cloud.bigquery.JobStatistics; import com.google.cloud.bigquery.JobStatistics.LoadStatistics; +import com.google.cloud.bigquery.JobStatistics.QueryStatistics; +import com.google.cloud.bigquery.JobStatistics.QueryStatistics.StatementType; +import com.google.cloud.bigquery.JobStatistics.SessionInfo; import com.google.cloud.bigquery.JobStatistics.TransactionInfo; import com.google.cloud.bigquery.LegacySQLTypeName; import com.google.cloud.bigquery.LoadJobConfiguration; @@ -79,6 +88,7 @@ import com.google.cloud.bigquery.Model; import com.google.cloud.bigquery.ModelId; import com.google.cloud.bigquery.ModelInfo; +import com.google.cloud.bigquery.Parameter; import com.google.cloud.bigquery.ParquetOptions; import 
com.google.cloud.bigquery.PolicyTags; import com.google.cloud.bigquery.QueryJobConfiguration; @@ -129,7 +139,11 @@ import java.math.BigDecimal; import java.nio.ByteBuffer; import java.nio.charset.StandardCharsets; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Time; import java.time.Instant; +import java.time.LocalTime; import java.time.Period; import java.util.ArrayList; import java.util.Collection; @@ -273,6 +287,103 @@ public class ITBigQueryTest { BIGNUMERIC_FIELD_SCHEMA3, BIGNUMERIC_FIELD_SCHEMA4); + private static final Schema BQ_RESULTSET_SCHEMA = + Schema.of( + Field.newBuilder("TimestampField", StandardSQLTypeName.TIMESTAMP) + .setMode(Field.Mode.NULLABLE) + .setDescription("TimestampDescription") + .build(), + Field.newBuilder("StringField", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .setDescription("StringDescription") + .build(), + Field.newBuilder("IntegerArrayField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.REPEATED) + .setDescription("IntegerArrayDescription") + .build(), + Field.newBuilder("BooleanField", StandardSQLTypeName.BOOL) + .setMode(Field.Mode.NULLABLE) + .setDescription("BooleanDescription") + .build(), + Field.newBuilder("BytesField", StandardSQLTypeName.BYTES) + .setMode(Field.Mode.NULLABLE) + .setDescription("BytesDescription") + .build(), + Field.newBuilder( + "RecordField", + StandardSQLTypeName.STRUCT, + Field.newBuilder("TimestampField", StandardSQLTypeName.TIMESTAMP) + .setMode(Field.Mode.NULLABLE) + .setDescription("TimestampDescription") + .build(), + Field.newBuilder("StringField", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .setDescription("StringDescription") + .build(), + Field.newBuilder("IntegerArrayField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.REPEATED) + .setDescription("IntegerArrayDescription") + .build(), + Field.newBuilder("BooleanField", StandardSQLTypeName.BOOL) + .setMode(Field.Mode.NULLABLE) + .setDescription("BooleanDescription") + .build(), + Field.newBuilder("BytesField", StandardSQLTypeName.BYTES) + .setMode(Field.Mode.NULLABLE) + .setDescription("BytesDescription") + .build()) + .setMode(Field.Mode.REQUIRED) + .setDescription("RecordDescription") + .build(), + Field.newBuilder("IntegerField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("IntegerDescription") + .build(), + Field.newBuilder("FloatField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("FloatDescription") + .build(), + Field.newBuilder("GeographyField", StandardSQLTypeName.GEOGRAPHY) + .setMode(Field.Mode.NULLABLE) + .setDescription("GeographyDescription") + .build(), + Field.newBuilder("NumericField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("NumericDescription") + .build(), + Field.newBuilder("BigNumericField", StandardSQLTypeName.BIGNUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("BigNumericDescription") + .build(), + Field.newBuilder("BigNumericField1", StandardSQLTypeName.BIGNUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("BigNumeric1Description") + .build(), + Field.newBuilder("BigNumericField2", StandardSQLTypeName.BIGNUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("BigNumeric2Description") + .build(), + Field.newBuilder("BigNumericField3", StandardSQLTypeName.BIGNUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("BigNumeric3Description") + .build(), + Field.newBuilder("BigNumericField4", StandardSQLTypeName.BIGNUMERIC) + 
.setMode(Field.Mode.NULLABLE) + .setDescription("BigNumeric4Description") + .build(), + Field.newBuilder("TimeField", StandardSQLTypeName.TIME) + .setMode(Field.Mode.NULLABLE) + .setDescription("TimeDescription") + .build(), + Field.newBuilder("DateField", StandardSQLTypeName.DATE) + .setMode(Field.Mode.NULLABLE) + .setDescription("DateDescription") + .build(), + Field.newBuilder("DateTimeField", StandardSQLTypeName.DATETIME) + .setMode(Field.Mode.NULLABLE) + .setDescription("DateTimeDescription") + .build()); + private static final Field DDL_TIMESTAMP_FIELD_SCHEMA = Field.newBuilder("TimestampField", LegacySQLTypeName.TIMESTAMP) .setDescription("TimestampDescription") @@ -316,6 +427,55 @@ public class ITBigQueryTest { Field.newBuilder("BooleanField", LegacySQLTypeName.BOOLEAN) .setMode(Field.Mode.NULLABLE) .build()); + + private static final Schema BQ_RESULTSET_EXPECTED_SCHEMA = + Schema.of( + Field.newBuilder("StringField", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("BigNumericField", StandardSQLTypeName.BIGNUMERIC) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("BooleanField", StandardSQLTypeName.BOOL) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("BytesField", StandardSQLTypeName.BYTES) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("IntegerField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("TimestampField", StandardSQLTypeName.TIMESTAMP) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("FloatField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("NumericField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("TimeField", StandardSQLTypeName.TIME) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("DateField", StandardSQLTypeName.DATE) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("DateTimeField", StandardSQLTypeName.DATETIME) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("GeographyField", StandardSQLTypeName.GEOGRAPHY) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("BytesField_1", StandardSQLTypeName.BYTES) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("BooleanField_1", StandardSQLTypeName.BOOL) + .setMode(Field.Mode.NULLABLE) + .build(), + Field.newBuilder("IntegerArrayField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.REPEATED) + .build()); + private static final Schema QUERY_RESULT_SCHEMA_BIGNUMERIC = Schema.of( Field.newBuilder("TimestampField", LegacySQLTypeName.TIMESTAMP) @@ -360,6 +520,7 @@ public class ITBigQueryTest { private static final String LOAD_FILE = "load.csv"; private static final String LOAD_FILE_LARGE = "load_large.csv"; private static final String JSON_LOAD_FILE = "load.json"; + private static final String JSON_LOAD_FILE_BQ_RESULTSET = "load_bq_resultset.json"; private static final String JSON_LOAD_FILE_SIMPLE = "load_simple.json"; private static final String EXTRACT_FILE = "extract.csv"; private static final String EXTRACT_MODEL_FILE = "extract_model.csv"; @@ -368,7 +529,10 @@ public class ITBigQueryTest { private static final TableId TABLE_ID_DDL = TableId.of(DATASET, "ddl_testing_table"); private static final TableId TABLE_ID_FASTQUERY = TableId.of(DATASET, "fastquery_testing_table"); private static final TableId TABLE_ID_LARGE = TableId.of(DATASET, "large_data_testing_table"); + private static final TableId 
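
The schema above pairs with the JSON fixture loaded just below, whose first row is entirely null; together they pin down the JDBC-style null contract of executeSelect results: object getters return null for NULL columns, while primitive getters fall back to 0/false (see testExecuteSelectSinglePageTableRow further down). A hedged fragment illustrating the contract; it assumes a Connection whose default dataset contains the fixture table:

```
// Assumes `connection` is a com.google.cloud.bigquery.Connection;
// rs.next() positions the cursor on the fixture's all-null first row.
ResultSet rs =
    connection
        .executeSelect(
            "select StringField, IntegerField, BooleanField from fastquery_testing_bq_resultset order by TimestampField")
        .getResultSet();
if (rs.next()) {
  String s = rs.getString("StringField");    // null  (object getter)
  int i = rs.getInt("IntegerField");         // 0     (primitive getter, JDBC convention)
  boolean b = rs.getBoolean("BooleanField"); // false (primitive getter, JDBC convention)
}
```
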
TABLE_ID_FASTQUERY_BQ_RESULTSET = + TableId.of(DATASET, "fastquery_testing_bq_resultset"); private static final String CSV_CONTENT = "StringValue1\nStringValue2\n"; + private static final String JSON_CONTENT = "{" + " \"TimestampField\": \"2014-08-19 07:41:35.220 -05:00\"," @@ -424,6 +588,64 @@ public class ITBigQueryTest { + " \"BigNumericField3\": \"578960446186580977117854925043439539266.34992332820282019728792003956564819967\"," + " \"BigNumericField4\": \"-578960446186580977117854925043439539266.34992332820282019728792003956564819968\"" + "}"; + + private static final String JSON_CONTENT_BQ_RESULTSET = + "{" + + " \"TimestampField\": null," + + " \"StringField\": null," + + " \"IntegerArrayField\": null," + + " \"BooleanField\": null," + + " \"BytesField\": null," + + " \"RecordField\": {" + + " \"TimestampField\": null," + + " \"StringField\": null," + + " \"IntegerArrayField\": null," + + " \"BooleanField\": null," + + " \"BytesField\": null" + + " }," + + " \"IntegerField\": null," + + " \"FloatField\": null," + + " \"GeographyField\": null," + + " \"NumericField\": null," + + " \"BigNumericField\": null," + + " \"BigNumericField1\": null," + + " \"BigNumericField2\": null," + + " \"BigNumericField3\": null," + + " \"BigNumericField4\": null," + + " \"TimeField\": null," + + " \"DateField\": null," + + " \"DateTimeField\": null" + + "}\n" + + "{" + + " \"TimestampField\": \"2018-08-19 12:11:35.123456 UTC\"," + + " \"StringField\": \"StringValue1\"," + + " \"IntegerArrayField\": [1,2,3,4]," + + " \"BooleanField\": \"false\"," + + " \"BytesField\": \"" + + BYTES_BASE64 + + "\"," + + " \"RecordField\": {" + + " \"TimestampField\": \"1969-07-20 20:18:04 UTC\"," + + " \"StringField\": null," + + " \"IntegerArrayField\": [1,0]," + + " \"BooleanField\": \"true\"," + + " \"BytesField\": \"" + + BYTES_BASE64 + + "\"" + + " }," + + " \"IntegerField\": \"1\"," + + " \"FloatField\": \"10.1\"," + + " \"GeographyField\": \"POINT(-122.35022 47.649154)\"," + + " \"NumericField\": \"100\"," + + " \"BigNumericField\": \"0.33333333333333333333333333333333333333\"," + + " \"BigNumericField1\": \"1e-38\"," + + " \"BigNumericField2\": \"-1e38\"," + + " \"BigNumericField3\": \"578960446186580977117854925043439539266.34992332820282019728792003956564819967\"," + + " \"BigNumericField4\": \"-578960446186580977117854925043439539266.34992332820282019728792003956564819968\"," + + " \"TimeField\": \"12:11:35.123456\"," + + " \"DateField\": \"2018-08-19\"," + + " \"DateTimeField\": \"2018-08-19 12:11:35.123456\"" + + "}"; private static final String JSON_CONTENT_SIMPLE = "{" + " \"TimestampField\": \"2014-08-19 07:41:35.220 -05:00\"," @@ -476,6 +698,11 @@ public static void beforeClass() throws InterruptedException, IOException { ITBigQueryTest.class.getClassLoader().getResourceAsStream("QueryTestData.csv"); storage.createFrom( BlobInfo.newBuilder(BUCKET, LOAD_FILE_LARGE).setContentType("text/plain").build(), stream); + storage.create( + BlobInfo.newBuilder(BUCKET, JSON_LOAD_FILE_BQ_RESULTSET) + .setContentType("application/json") + .build(), + JSON_CONTENT_BQ_RESULTSET.getBytes(StandardCharsets.UTF_8)); DatasetInfo info = DatasetInfo.newBuilder(DATASET).setDescription(DESCRIPTION).setLabels(LABELS).build(); bigquery.create(info); @@ -509,6 +736,19 @@ public static void beforeClass() throws InterruptedException, IOException { jobFastQuery = jobFastQuery.waitFor(); assertNull(jobFastQuery.getStatus().getError()); + LoadJobConfiguration configFastQueryBQResultset = + LoadJobConfiguration.newBuilder( + 
TABLE_ID_FASTQUERY_BQ_RESULTSET, + "gs://" + BUCKET + "/" + JSON_LOAD_FILE_BQ_RESULTSET, + FormatOptions.json()) + .setCreateDisposition(JobInfo.CreateDisposition.CREATE_IF_NEEDED) + .setSchema(BQ_RESULTSET_SCHEMA) + .setLabels(labels) + .build(); + Job jobFastQueryBQResultSet = bigquery.create(JobInfo.of(configFastQueryBQResultset)); + jobFastQueryBQResultSet = jobFastQueryBQResultSet.waitFor(); + assertNull(jobFastQueryBQResultSet.getStatus().getError()); + LoadJobConfiguration configurationDDL = LoadJobConfiguration.newBuilder( TABLE_ID_DDL, "gs://" + BUCKET + "/" + JSON_LOAD_FILE_SIMPLE, FormatOptions.json()) @@ -712,6 +952,7 @@ public void testCreateTableWithRangePartitioning() { } } + /* TODO(prasmish): replicate this test case for executeSelect on the relevant part */ @Test public void testJsonType() throws InterruptedException { String tableName = "test_create_table_jsontype"; @@ -819,6 +1060,7 @@ public void testJsonType() throws InterruptedException { } } + /* TODO(prasmish): replicate this test case for executeSelect on the relevant part */ @Test public void testIntervalType() throws InterruptedException { String tableName = "test_create_table_intervaltype"; @@ -1701,6 +1943,7 @@ public void testInsertAllWithErrors() { assertTrue(bigquery.delete(TableId.of(DATASET, tableName))); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testListAllTableData() { Page rows = bigquery.listTableData(TABLE_ID); @@ -2036,6 +2279,7 @@ public void testAuthorizeDataset() { assertEquals(sharedDatasetAcl, updatedDataset.getAcl()); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testSingleStatementsQueryException() throws InterruptedException { String invalidQuery = @@ -2052,6 +2296,7 @@ public void testSingleStatementsQueryException() throws InterruptedException { } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testMultipleStatementsQueryException() throws InterruptedException { String invalidQuery = @@ -2070,6 +2315,7 @@ public void testMultipleStatementsQueryException() throws InterruptedException { } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testQuery() throws InterruptedException { String query = "SELECT TimestampField, StringField, BooleanField FROM " + TABLE_ID.getTable(); @@ -2102,6 +2348,16 @@ public void testQuery() throws InterruptedException { assertNotNull(statistics.getQueryPlan()); } + @Test + public void testExecuteSelectDefaultConnectionSettings() throws SQLException { + // Use the default connection settings + Connection connection = bigquery.createConnection(); + String query = "SELECT corpus FROM `bigquery-public-data.samples.shakespeare` GROUP BY corpus;"; + BigQueryResult bigQueryResult = connection.executeSelect(query); + assertEquals(42, bigQueryResult.getTotalRows()); + } + + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testQueryTimeStamp() throws InterruptedException { String query = "SELECT TIMESTAMP '2022-01-24T23:54:25.095574Z'"; @@ -2136,6 +2392,7 @@ public void testQueryTimeStamp() throws InterruptedException { } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testQueryCaseInsensitiveSchemaFieldByGetName() throws InterruptedException { String query = "SELECT TimestampField, StringField, BooleanField FROM " + TABLE_ID.getTable(); @@ -2164,6 +2421,7 @@ public void 
testQueryCaseInsensitiveSchemaFieldByGetName() throws InterruptedExc assertEquals(2, rowCount); } + /* TODO(prasmish): replicate bigquery.query part of the test case for executeSelect - modify this test case */ @Test public void testQueryExternalHivePartitioningOptionAutoLayout() throws InterruptedException { String tableName = "test_queryexternalhivepartition_autolayout_table"; @@ -2198,6 +2456,7 @@ public void testQueryExternalHivePartitioningOptionAutoLayout() throws Interrupt assertTrue(bigquery.delete(tableId)); } + /* TODO(prasmish): replicate bigquery.query part of the test case for executeSelect - modify this test case */ @Test public void testQueryExternalHivePartitioningOptionCustomLayout() throws InterruptedException { String tableName = "test_queryexternalhivepartition_customlayout_table"; @@ -2233,6 +2492,463 @@ public void testQueryExternalHivePartitioningOptionCustomLayout() throws Interru assertTrue(bigquery.delete(tableId)); } + @Test + public void testConnectionImplDryRun() throws SQLException { + String query = + String.format( + "select StringField, BigNumericField, BooleanField, BytesField, IntegerField, TimestampField, FloatField, NumericField, TimeField, DateField, DateTimeField , GeographyField, RecordField.BytesField, RecordField.BooleanField, IntegerArrayField from %s where StringField = ? order by TimestampField", + TABLE_ID_FASTQUERY_BQ_RESULTSET.getTable()); + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setCreateSession(true) + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryDryRunResult bigQueryDryRunResultSet = connection.dryRun(query); + assertNotNull(bigQueryDryRunResultSet.getSchema()); + assertEquals( + BQ_RESULTSET_EXPECTED_SCHEMA, bigQueryDryRunResultSet.getSchema()); // match the schema + List<Parameter> queryParameters = bigQueryDryRunResultSet.getQueryParameters(); + assertEquals(StandardSQLTypeName.STRING, queryParameters.get(0).getValue().getType()); + QueryStatistics queryStatistics = bigQueryDryRunResultSet.getStatistics().getQueryStatistics(); + assertNotNull(queryStatistics); + SessionInfo sessionInfo = bigQueryDryRunResultSet.getStatistics().getSessionInfo(); + assertNotNull(sessionInfo.getSessionId()); + assertEquals(StatementType.SELECT, queryStatistics.getStatementType()); + } + + @Test + // This test case tests the order of the records, making sure that the result is not jumbled up due + // to the multithreaded BigQueryResult implementation + public void testBQResultSetMultiThreadedOrder() throws SQLException { + String query = + "SELECT date FROM " + + TABLE_ID_LARGE.getTable() + + " where date is not null order by date asc limit 300000"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setNumBufferedRows(10000) // page size + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + assertTrue(rs.next()); + ++cnt; + java.sql.Date lastDate = rs.getDate(0); + while (rs.next()) { + assertNotNull(rs.getDate(0)); + assertTrue(rs.getDate(0).getTime() >= lastDate.getTime()); // sorted order is maintained + lastDate = rs.getDate(0); + ++cnt; + } + assertEquals(300000, cnt); // total 300000 rows should be read + } + + @Test + public void testBQResultSetPaginationSlowQuery() throws SQLException { + String query =
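
testConnectionImplDryRun above exercises Connection.dryRun, which validates a (possibly parameterized) statement and returns schema, query-parameter, and statistics metadata without running anything. A minimal usage sketch against the same API; the dataset, table, and SQL are placeholders:

```
import com.google.cloud.bigquery.*;
import java.sql.SQLException;

public class DryRunSketch {
  public static void main(String[] args) throws SQLException {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    ConnectionSettings settings =
        ConnectionSettings.newBuilder()
            .setDefaultDataset(DatasetId.of("my_dataset")) // placeholder dataset
            .build();
    Connection connection = bigquery.createConnection(settings);
    // Nothing executes; only metadata comes back.
    BigQueryDryRunResult dryRun =
        connection.dryRun("select StringField from my_table where StringField = ?");
    System.out.println(dryRun.getSchema()); // schema the query would produce
    System.out.println(dryRun.getQueryParameters()); // one positional STRING parameter
    connection.close();
  }
}
```
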
+ "SELECT date, county, state_name, confirmed_cases, deaths FROM " + + TABLE_ID_LARGE.getTable() + + " where date is not null and county is not null and state_name is not null order by date limit 300000"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setNumBufferedRows(10000) // page size + .setJobTimeoutMs( + 15000L) // So that ConnectionImpl.isFastQuerySupported returns false, and the slow + // query route gets executed + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + while (rs.next()) { // pagination starts after approx 120,000 records + assertNotNull(rs.getDate(0)); + assertNotNull(rs.getString(1)); + assertNotNull(rs.getString(2)); + assertTrue(rs.getInt(3) >= 0); + assertTrue(rs.getInt(4) >= 0); + ++cnt; + } + assertEquals(300000, cnt); // total 300000 rows should be read + } + + @Test + public void testExecuteSelectSinglePageTableRow() throws SQLException { + String query = + "select StringField, BigNumericField, BooleanField, BytesField, IntegerField, TimestampField, FloatField, " + + "NumericField, TimeField, DateField, DateTimeField , GeographyField, RecordField.BytesField, RecordField.BooleanField, IntegerArrayField from " + + TABLE_ID_FASTQUERY_BQ_RESULTSET.getTable() + + " order by TimestampField"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + Schema sc = bigQueryResult.getSchema(); + + assertEquals(BQ_RESULTSET_EXPECTED_SCHEMA, sc); // match the schema + assertEquals(2, bigQueryResult.getTotalRows()); // Expecting 2 rows + + assertTrue(rs.next()); // first row + // checking for the null or 0 column values + assertNull(rs.getString("StringField")); + assertTrue(rs.getDouble("BigNumericField") == 0.0d); + assertFalse(rs.getBoolean("BooleanField")); + assertNull(rs.getBytes("BytesField")); + assertEquals(rs.getInt("IntegerField"), 0); + assertNull(rs.getTimestamp("TimestampField")); + assertNull(rs.getDate("DateField")); + assertTrue(rs.getDouble("FloatField") == 0.0d); + assertTrue(rs.getDouble("NumericField") == 0.0d); + assertNull(rs.getTime("TimeField")); + assertNull(rs.getString("DateTimeField")); + assertNull(rs.getString("GeographyField")); + assertNull(rs.getBytes("BytesField_1")); + assertFalse(rs.getBoolean("BooleanField_1")); + + assertTrue(rs.next()); // second row + // second row is non null, comparing the values + assertEquals("StringValue1", rs.getString("StringField")); + assertTrue(rs.getDouble("BigNumericField") == 0.3333333333333333d); + assertFalse(rs.getBoolean("BooleanField")); + assertNotNull(rs.getBytes("BytesField")); + assertEquals(1, rs.getInt("IntegerField")); + assertEquals(1534680695123L, rs.getTimestamp("TimestampField").getTime()); + assertEquals(java.sql.Date.valueOf("2018-08-19"), rs.getDate("DateField")); + assertTrue(rs.getDouble("FloatField") == 10.1d); + assertTrue(rs.getDouble("NumericField") == 100.0d); + assertEquals(Time.valueOf(LocalTime.of(12, 11, 35, 123456)), rs.getTime("TimeField")); + assertEquals("2018-08-19T12:11:35.123456", rs.getString("DateTimeField")); + assertEquals("POINT(-122.35022 47.649154)", 
rs.getString("GeographyField")); + assertNotNull(rs.getBytes("BytesField_1")); + assertTrue(rs.getBoolean("BooleanField_1")); + assertTrue( + rs.getObject("IntegerArrayField") instanceof com.google.cloud.bigquery.FieldValueList); + FieldValueList integerArrayFieldValue = + (com.google.cloud.bigquery.FieldValueList) rs.getObject("IntegerArrayField"); + assertEquals(4, integerArrayFieldValue.size()); // Array has 4 elements + assertEquals(3, (integerArrayFieldValue.get(2).getNumericValue()).intValue()); + + assertFalse(rs.next()); // no 3rd row in the table + } + + @Test + public void testConnectionClose() throws SQLException { + String query = + "SELECT date, county, state_name, confirmed_cases, deaths FROM " + + TABLE_ID_LARGE.getTable() + + " where date is not null and county is not null and state_name is not null order by date limit 300000"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setNumBufferedRows(10000) // page size + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + while (rs.next()) { + ++cnt; + if (cnt > 57000) { // breaking at 57K, query reads 300K + assertTrue(connection.close()); // we should be able to cancel the connection + } + } + assertTrue( + cnt < 60000); // Few extra records are still read (generally ~10) even after canceling, as + // the backgrounds threads are still active while the interrupt occurs and the + // buffer and pageCache are cleared + } + + @Test + public void testBQResultSetPagination() throws SQLException { + String query = + "SELECT date, county, state_name, confirmed_cases, deaths FROM " + + TABLE_ID_LARGE.getTable() + + " where date is not null and county is not null and state_name is not null order by date limit 300000"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setNumBufferedRows(10000) // page size + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + while (rs.next()) { // pagination starts after approx 120,000 records + assertNotNull(rs.getDate(0)); + assertNotNull(rs.getString(1)); + assertNotNull(rs.getString(2)); + assertTrue(rs.getInt(3) >= 0); + assertTrue(rs.getInt(4) >= 0); + ++cnt; + } + assertEquals(300000, cnt); // total 300000 rows should be read + } + + @Test + public void testReadAPIIterationAndOrder() + throws SQLException { // use read API to read 300K records and check the order + String query = + "SELECT date, county, state_name, confirmed_cases, deaths FROM " + + TABLE_ID_LARGE.getTable() + + " where date is not null and county is not null and state_name is not null order by confirmed_cases asc limit 300000"; + + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setPriority( + QueryJobConfiguration.Priority + .INTERACTIVE) // required for this integration test so that isFastQuerySupported + // returns false + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + int lasConfirmedCases = Integer.MIN_VALUE; + while 
(rs.next()) { // pagination starts after approx 120,000 records + assertNotNull(rs.getDate(0)); + assertNotNull(rs.getString(1)); + assertNotNull(rs.getString(2)); + assertTrue(rs.getInt(3) >= 0); + assertTrue(rs.getInt(4) >= 0); + + // check if the records are sorted + assertTrue(rs.getInt(3) >= lasConfirmedCases); + lasConfirmedCases = rs.getInt(3); + ++cnt; + } + assertEquals(300000, cnt); // total 300000 rows should be read + connection.close(); + } + + @Test + public void testReadAPIConnectionMultiClose() + throws + SQLException { // use read API to read 300K records, then closes the connection. This test + // repeats it multiple times and asserts if the connection was closed + String query = + "SELECT date, county, state_name, confirmed_cases, deaths FROM " + + TABLE_ID_LARGE.getTable() + + " where date is not null and county is not null and state_name is not null order by confirmed_cases asc limit 300000"; + + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setPriority( + QueryJobConfiguration.Priority + .INTERACTIVE) // required for this integration test so that isFastQuerySupported + // returns false + .build(); + int closeCnt = 0, runCnt = 3; + for (int run = 0; run < runCnt; run++) { + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + while (rs.next()) { // pagination starts after approx 120,000 records + assertNotNull(rs.getDate(0)); + ++cnt; + } + assertEquals(300000, cnt); // total 300000 rows should be read + assertTrue(connection.close()); // check if connection closed + closeCnt++; + } + assertEquals( + closeCnt, runCnt); // check if the connection closed for the required number of times + } + + @Test + public void testExecuteSelectSinglePageTableRowColInd() throws SQLException { + String query = + "select StringField, BigNumericField, BooleanField, BytesField, IntegerField, TimestampField, FloatField, " + + "NumericField, TimeField, DateField, DateTimeField , GeographyField, RecordField.BytesField, RecordField.BooleanField, IntegerArrayField from " + + TABLE_ID_FASTQUERY_BQ_RESULTSET.getTable() + + " order by TimestampField"; + /* + Column Index mapping for ref: + StringField, 0 BigNumericField, 1 BooleanField, 2 BytesField, 3 IntegerField, 4 TimestampField, 5 FloatField, 6 + NumericField, 7 TimeField, 8 DateField, 9 DateTimeField , 10 GeographyField, 11 RecordField.BytesField, 12 RecordField.BooleanField, 13 IntegerArrayField 14 + */ + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + ResultSet rs = bigQueryResult.getResultSet(); + Schema sc = bigQueryResult.getSchema(); + + assertEquals(BQ_RESULTSET_EXPECTED_SCHEMA, sc); // match the schema + assertEquals(2, bigQueryResult.getTotalRows()); // Expecting 2 rows + while (rs.next()) { + assertEquals(rs.getString(0), rs.getString("StringField")); + assertTrue(rs.getDouble(1) == rs.getDouble("BigNumericField")); + assertEquals(rs.getBoolean(2), rs.getBoolean("BooleanField")); + if (rs.getBytes(3) == null) { // both overloads should be null + assertEquals(rs.getBytes(3), rs.getBytes("BytesField")); + } else { // value in String representation should be the same + assertEquals( + new
String(rs.getBytes(3), StandardCharsets.UTF_8), + new String(rs.getBytes("BytesField"), StandardCharsets.UTF_8)); + } + assertEquals(rs.getInt(4), rs.getInt("IntegerField")); + assertEquals(rs.getTimestamp(5), rs.getTimestamp("TimestampField")); + assertEquals(rs.getDate(9), rs.getDate("DateField")); + assertTrue(rs.getDouble("FloatField") == rs.getDouble(6)); + assertTrue(rs.getDouble("NumericField") == rs.getDouble(7)); + assertEquals(rs.getTime(8), rs.getTime("TimeField")); + assertEquals(rs.getString(10), rs.getString("DateTimeField")); + assertEquals(rs.getString(11), rs.getString("GeographyField")); + if (rs.getBytes(12) == null) { // both overloads should be null + assertEquals(rs.getBytes(12), rs.getBytes("BytesField_1")); + } else { // value in String representation should be the same + assertEquals( + new String(rs.getBytes(12), StandardCharsets.UTF_8), + new String(rs.getBytes("BytesField_1"), StandardCharsets.UTF_8)); + } + assertEquals(rs.getBoolean(13), rs.getBoolean("BooleanField_1")); + assertTrue( + rs.getObject("IntegerArrayField") instanceof com.google.cloud.bigquery.FieldValueList); + FieldValueList integerArrayFieldValue = + (com.google.cloud.bigquery.FieldValueList) rs.getObject("IntegerArrayField"); + assertTrue(rs.getObject(14) instanceof com.google.cloud.bigquery.FieldValueList); + FieldValueList integerArrayFieldValueColInd = + (com.google.cloud.bigquery.FieldValueList) rs.getObject(14); + assertEquals( + integerArrayFieldValue.size(), + integerArrayFieldValueColInd.size()); // Array has 4 elements + if (integerArrayFieldValue.size() == 4) { // as we are picking the third index + assertEquals( + (integerArrayFieldValue.get(2).getNumericValue()).intValue(), + (integerArrayFieldValueColInd.get(2).getNumericValue()).intValue()); + } + } + } + + @Test + public void testExecuteSelectStruct() throws SQLException { + String query = "select (STRUCT(\"Vancouver\" as city, 5 as years)) as address"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + assertEquals(1, bigQueryResult.getTotalRows()); + + Schema schema = bigQueryResult.getSchema(); + assertEquals("address", schema.getFields().get(0).getName()); + assertEquals(Field.Mode.NULLABLE, schema.getFields().get(0).getMode()); + // Backend is currently returning LegacySQLTypeName. 
Tracking bug: b/202977620 + assertEquals(LegacySQLTypeName.RECORD, schema.getFields().get(0).getType()); + assertEquals("city", schema.getFields().get(0).getSubFields().get(0).getName()); + assertEquals( + LegacySQLTypeName.STRING, schema.getFields().get(0).getSubFields().get(0).getType()); + assertEquals(Field.Mode.NULLABLE, schema.getFields().get(0).getSubFields().get(0).getMode()); + assertEquals("years", schema.getFields().get(0).getSubFields().get(1).getName()); + assertEquals( + LegacySQLTypeName.INTEGER, schema.getFields().get(0).getSubFields().get(1).getType()); + assertEquals(Field.Mode.NULLABLE, schema.getFields().get(0).getSubFields().get(1).getMode()); + + ResultSet rs = bigQueryResult.getResultSet(); + assertTrue(rs.next()); + FieldValueList addressFieldValue = + (com.google.cloud.bigquery.FieldValueList) rs.getObject("address"); + assertEquals(rs.getObject("address"), rs.getObject(0)); + assertEquals("Vancouver", addressFieldValue.get(0).getStringValue()); + assertEquals(5, addressFieldValue.get(1).getLongValue()); + assertFalse(rs.next()); // only 1 row of data + } + + @Test + public void testExecuteSelectStructSubField() throws SQLException { + String query = + "select address.city from (select (STRUCT(\"Vancouver\" as city, 5 as years)) as address)"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + assertEquals(1, bigQueryResult.getTotalRows()); + + Schema schema = bigQueryResult.getSchema(); + assertEquals("city", schema.getFields().get(0).getName()); + assertEquals(Field.Mode.NULLABLE, schema.getFields().get(0).getMode()); + // Backend is currently returning LegacySQLTypeName. 
Tracking bug: b/202977620 + assertEquals(LegacySQLTypeName.STRING, schema.getFields().get(0).getType()); + assertNull( + schema.getFields().get(0).getSubFields()); // this is a String field without any subfields + + ResultSet rs = bigQueryResult.getResultSet(); + assertTrue(rs.next()); + String cityFieldValue = rs.getString("city"); + assertEquals(rs.getString("city"), rs.getObject(0)); + assertEquals("Vancouver", cityFieldValue); + assertFalse(rs.next()); // only 1 row of data + } + + @Test + public void testExecuteSelectArray() throws SQLException { + String query = "SELECT [1,2,3]"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + assertEquals(1, bigQueryResult.getTotalRows()); + + Schema schema = bigQueryResult.getSchema(); + assertEquals("f0_", schema.getFields().get(0).getName()); + assertEquals(Field.Mode.REPEATED, schema.getFields().get(0).getMode()); + assertEquals(LegacySQLTypeName.INTEGER, schema.getFields().get(0).getType()); + assertNull(schema.getFields().get(0).getSubFields()); // no subfields for Integers + + ResultSet rs = bigQueryResult.getResultSet(); + assertTrue(rs.next()); + FieldValueList arrayFieldValue = (com.google.cloud.bigquery.FieldValueList) rs.getObject(0); + assertEquals(1, arrayFieldValue.get(0).getLongValue()); + assertEquals(2, arrayFieldValue.get(1).getLongValue()); + assertEquals(3, arrayFieldValue.get(2).getLongValue()); + } + + @Test + public void testExecuteSelectArrayOfStruct() throws SQLException { + String query = + "SELECT [STRUCT(\"Vancouver\" as city, 5 as years), STRUCT(\"Boston\" as city, 10 as years)]"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + assertEquals(1, bigQueryResult.getTotalRows()); + + Schema schema = bigQueryResult.getSchema(); + assertEquals("f0_", schema.getFields().get(0).getName()); + assertEquals(Field.Mode.REPEATED, schema.getFields().get(0).getMode()); + // Backend is currently returning LegacySQLTypeName. 
Tracking bug: b/202977620 + // Verify the field metadata of the two subfields of the struct + assertEquals(LegacySQLTypeName.RECORD, schema.getFields().get(0).getType()); + assertEquals("city", schema.getFields().get(0).getSubFields().get(0).getName()); + assertEquals( + LegacySQLTypeName.STRING, schema.getFields().get(0).getSubFields().get(0).getType()); + assertEquals(Field.Mode.NULLABLE, schema.getFields().get(0).getSubFields().get(0).getMode()); + assertEquals("years", schema.getFields().get(0).getSubFields().get(1).getName()); + assertEquals( + LegacySQLTypeName.INTEGER, schema.getFields().get(0).getSubFields().get(1).getType()); + assertEquals(Field.Mode.NULLABLE, schema.getFields().get(0).getSubFields().get(1).getMode()); + + ResultSet rs = bigQueryResult.getResultSet(); + assertTrue(rs.next()); + FieldValueList arrayOfStructFieldValue = + (com.google.cloud.bigquery.FieldValueList) rs.getObject(0); + // Verify the values of the two structs in the array + assertEquals(Attribute.RECORD, arrayOfStructFieldValue.get(0).getAttribute()); + assertEquals( + "Vancouver", arrayOfStructFieldValue.get(0).getRecordValue().get(0).getStringValue()); + assertEquals(5, arrayOfStructFieldValue.get(0).getRecordValue().get(1).getLongValue()); + assertEquals(Attribute.RECORD, arrayOfStructFieldValue.get(1).getAttribute()); + assertEquals("Boston", arrayOfStructFieldValue.get(1).getRecordValue().get(0).getStringValue()); + assertEquals(10, arrayOfStructFieldValue.get(1).getRecordValue().get(1).getLongValue()); + } + + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testFastQueryMultipleRuns() throws InterruptedException { String query = @@ -2265,6 +2981,7 @@ public void testFastQueryMultipleRuns() throws InterruptedException { assertFalse(result2.hasNextPage()); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testFastQuerySinglePageDuplicateRequestIds() throws InterruptedException { String query = @@ -2294,6 +3011,7 @@ public void testFastQuerySinglePageDuplicateRequestIds() throws InterruptedExcep assertFalse(result2.hasNextPage()); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testFastSQLQuery() throws InterruptedException { String query = @@ -2323,6 +3041,7 @@ public void testFastSQLQuery() throws InterruptedException { } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testFastSQLQueryMultiPage() throws InterruptedException { String query = @@ -2449,6 +3168,7 @@ public void testFastQuerySlowDDL() throws InterruptedException { } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testFastQueryHTTPException() throws InterruptedException { String queryInvalid = @@ -2511,10 +3231,36 @@ public void testQuerySessionSupport() throws InterruptedException { remoteJobWithSession = remoteJobWithSession.waitFor(); assertNull(remoteJobWithSession.getStatus().getError()); Job queryJobWithSession = bigquery.getJob(remoteJobWithSession.getJobId()); - JobStatistics.QueryStatistics statisticsWithSession = queryJobWithSession.getStatistics(); + QueryStatistics statisticsWithSession = queryJobWithSession.getStatistics(); assertEquals(sessionId, statisticsWithSession.getSessionInfo().getSessionId()); } + // TODO: uncomment this testcase when executeUpdate is implemented + // @Test + // public void testExecuteSelectWithSession() throws BigQuerySQLException { + // String query = "CREATE TEMPORARY TABLE 
temptable AS SELECT 17 as foo"; + // ConnectionSettings connectionSettings = + // ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).setCreateSession(true).build(); + // Connection connection = bigquery.createConnection(connectionSettings); + // BigQueryResult bigQueryResult = connection.execute(query); + // BigQueryResultStats stats = bigQueryResult.getBigQueryResultStats(); + // assertNotNull(stats.getSessionInfo().getSessionId()); + // } + + @Test + public void testExecuteSelectSessionSupport() throws BigQuerySQLException { + String query = "SELECT 17 as foo"; + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .setCreateSession(true) + .build(); + Connection connection = bigquery.createConnection(connectionSettings); + BigQueryResult bigQueryResult = connection.executeSelect(query); + String sessionId = bigQueryResult.getBigQueryResultStats().getSessionInfo().getSessionId(); + assertNotNull(sessionId); + } + @Test public void testDmlStatistics() throws InterruptedException { String tableName = TABLE_ID_FASTQUERY.getTable(); @@ -2535,6 +3281,7 @@ public void testDmlStatistics() throws InterruptedException { assertEquals(2L, statistics.getDmlStats().getUpdatedRowCount().longValue()); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testTransactionInfo() throws InterruptedException { String tableName = TABLE_ID_FASTQUERY.getTable(); @@ -2556,6 +3303,7 @@ public void testTransactionInfo() throws InterruptedException { } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testScriptStatistics() throws InterruptedException { String script = @@ -2706,6 +3454,27 @@ public void testPositionalQueryParameters() throws InterruptedException { } } + /* TODO(prasmish): expand below test case with all the fields shown in the above test case */ + @Test + public void testExecuteSelectWithPositionalQueryParameters() throws BigQuerySQLException { + String query = + "SELECT TimestampField, StringField FROM " + + TABLE_ID.getTable() + + " WHERE StringField = ?" 
+ + " AND TimestampField > ?"; + QueryParameterValue stringParameter = QueryParameterValue.string("stringValue"); + QueryParameterValue timestampParameter = + QueryParameterValue.timestamp("2014-01-01 07:00:00.000000+00:00"); + Parameter stringParam = Parameter.newBuilder().setValue(stringParameter).build(); + Parameter timeStampParam = Parameter.newBuilder().setValue(timestampParameter).build(); + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + List parameters = ImmutableList.of(stringParam, timeStampParam); + BigQueryResult rs = connection.executeSelect(query, parameters); + assertEquals(2, rs.getTotalRows()); + } + @Test public void testNamedQueryParameters() throws InterruptedException { String query = @@ -2728,6 +3497,30 @@ public void testNamedQueryParameters() throws InterruptedException { assertEquals(2, Iterables.size(result.getValues())); } + @Test + public void testExecuteSelectWithNamedQueryParameters() throws BigQuerySQLException { + String query = + "SELECT TimestampField, StringField, BooleanField FROM " + + TABLE_ID.getTable() + + " WHERE StringField = @stringParam" + + " AND IntegerField IN UNNEST(@integerList)"; + QueryParameterValue stringParameter = QueryParameterValue.string("stringValue"); + QueryParameterValue intArrayParameter = + QueryParameterValue.array(new Integer[] {3, 4}, Integer.class); + Parameter stringParam = + Parameter.newBuilder().setName("stringParam").setValue(stringParameter).build(); + Parameter intArrayParam = + Parameter.newBuilder().setName("integerList").setValue(intArrayParameter).build(); + + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of(DATASET)).build(); + Connection connection = bigquery.createConnection(connectionSettings); + List parameters = ImmutableList.of(stringParam, intArrayParam); + BigQueryResult rs = connection.executeSelect(query, parameters); + assertEquals(2, rs.getTotalRows()); + } + + /* TODO(prasmish): replicate relevant parts of the test case for executeSelect */ @Test public void testStructNamedQueryParameters() throws InterruptedException { QueryParameterValue booleanValue = QueryParameterValue.bool(true); @@ -2782,6 +3575,7 @@ private static void assertsFieldValue(FieldValue record) { assertEquals("test-stringField", record.getRecordValue().get("stringField").getStringValue()); } + /* TODO(prasmish): replicate relevant parts of the test case for executeSelect */ @Test public void testNestedStructNamedQueryParameters() throws InterruptedException { QueryParameterValue booleanValue = QueryParameterValue.bool(true); @@ -2825,6 +3619,7 @@ public void testNestedStructNamedQueryParameters() throws InterruptedException { } } + /* TODO(prasmish): replicate relevant parts of the test case for executeSelect */ @Test public void testBytesParameter() throws Exception { String query = "SELECT BYTE_LENGTH(@p) AS length"; @@ -3108,6 +3903,7 @@ public void testCopyJobWithLabels() throws InterruptedException { assertTrue(remoteTable.delete()); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testQueryJob() throws InterruptedException, TimeoutException { String tableName = "test_query_job_table"; @@ -3152,6 +3948,7 @@ public void testQueryJob() throws InterruptedException, TimeoutException { assertNotNull(statistics.getQueryPlan()); } + /* TODO(prasmish): replicate the entire test case 
for executeSelect */ @Test public void testQueryJobWithConnectionProperties() throws InterruptedException { String tableName = "test_query_job_table_connection_properties"; @@ -3171,6 +3968,7 @@ public void testQueryJobWithConnectionProperties() throws InterruptedException { assertTrue(bigquery.delete(destinationTable)); } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testQueryJobWithLabels() throws InterruptedException, TimeoutException { String tableName = "test_query_job_table"; @@ -3194,6 +3992,7 @@ public void testQueryJobWithLabels() throws InterruptedException, TimeoutExcepti } } + /* TODO(prasmish): replicate the entire test case for executeSelect */ @Test public void testQueryJobWithRangePartitioning() throws InterruptedException { String tableName = "test_query_job_table_rangepartitioning"; @@ -3304,6 +4103,7 @@ public void testQueryJobWithDryRun() throws InterruptedException, TimeoutExcepti .build(); Job remoteJob = bigquery.create(JobInfo.of(configuration)); assertNull(remoteJob.getJobId().getJob()); + remoteJob.getStatistics(); assertEquals(DONE, remoteJob.getStatus().getState()); assertNotNull(remoteJob.getConfiguration()); } diff --git a/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITNightlyBigQueryTest.java b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITNightlyBigQueryTest.java new file mode 100644 index 000000000..d672967b1 --- /dev/null +++ b/google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITNightlyBigQueryTest.java @@ -0,0 +1,610 @@ +/* + * Copyright 2022 Google LLC + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package com.google.cloud.bigquery.it; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertNull; +import static org.junit.Assert.assertTrue; +import static org.junit.Assert.fail; + +import com.google.cloud.bigquery.BigQuery; +import com.google.cloud.bigquery.BigQueryError; +import com.google.cloud.bigquery.BigQueryException; +import com.google.cloud.bigquery.BigQueryResult; +import com.google.cloud.bigquery.BigQuerySQLException; +import com.google.cloud.bigquery.Connection; +import com.google.cloud.bigquery.ConnectionSettings; +import com.google.cloud.bigquery.Dataset; +import com.google.cloud.bigquery.DatasetId; +import com.google.cloud.bigquery.DatasetInfo; +import com.google.cloud.bigquery.Field; +import com.google.cloud.bigquery.InsertAllRequest; +import com.google.cloud.bigquery.InsertAllResponse; +import com.google.cloud.bigquery.Parameter; +import com.google.cloud.bigquery.QueryParameterValue; +import com.google.cloud.bigquery.Schema; +import com.google.cloud.bigquery.StandardSQLTypeName; +import com.google.cloud.bigquery.StandardTableDefinition; +import com.google.cloud.bigquery.Table; +import com.google.cloud.bigquery.TableDefinition; +import com.google.cloud.bigquery.TableId; +import com.google.cloud.bigquery.TableInfo; +import com.google.cloud.bigquery.testing.RemoteBigQueryHelper; +import com.google.common.collect.ImmutableList; +import com.google.common.io.BaseEncoding; +import java.io.IOException; +import java.math.BigDecimal; +import java.nio.charset.StandardCharsets; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Time; +import java.time.LocalTime; +import java.time.ZoneId; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.TimeZone; +import java.util.concurrent.ExecutionException; +import java.util.logging.Level; +import java.util.logging.Logger; +import org.apache.arrow.vector.util.JsonStringArrayList; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.Timeout; + +public class ITNightlyBigQueryTest { + private static final Logger logger = Logger.getLogger(ITNightlyBigQueryTest.class.getName()); + private static final String DATASET = RemoteBigQueryHelper.generateDatasetName(); + private static final String TABLE = "TEMP_RS_TEST_TABLE"; + private static final byte[] BYTES = "TestByteValue".getBytes(StandardCharsets.UTF_8); + private static final String BYTES_BASE64 = BaseEncoding.base64().encode(BYTES); + // The setup populates NUM_BATCHES * REC_PER_BATCHES records (here 55 * 10000 = 550K) + private static final int NUM_BATCHES = 55; + private static final int REC_PER_BATCHES = 10000; + private static final int LIMIT_RECS = 500000; // read ~500K of the ~550K populated records + private static final int MULTI_LIMIT_RECS = + 300000; // Used for the multi-query test case; a lower limit like 300K keeps the runtime down + private static int rowCnt = 0; + private static BigQuery bigquery; + private static final String BASE_QUERY = + "select StringField, GeographyField, BooleanField, BigNumericField, IntegerField, NumericField, BytesField, " + + "TimestampField, TimeField, DateField, IntegerArrayField, RecordField.BooleanField, RecordField.StringField ," + + " JSONField, JSONField.hello, JSONField.id from %s.%s order by IntegerField asc LIMIT %s"; + private static final String POSITIONAL_QUERY = + String.format( 
+ "select RecordField.BooleanField, RecordField.StringField, StringField, BooleanField, BytesField, IntegerField, GeographyField, NumericField, BigNumericField, TimeField, DateField, TimestampField, JSONField from %s.%s where DateField = ? and BooleanField = ? and IntegerField > ? and NumericField > ? LIMIT %s", + DATASET, TABLE, MULTI_LIMIT_RECS); + private static final String QUERY = String.format(BASE_QUERY, DATASET, TABLE, LIMIT_RECS); + private static final String MULTI_QUERY = + String.format(BASE_QUERY, DATASET, TABLE, MULTI_LIMIT_RECS); + private static final String INVALID_QUERY = + String.format( + "select into %s.%s order by IntegerField asc LIMIT %s", DATASET, TABLE, LIMIT_RECS); + + private static final Schema BQ_SCHEMA = + Schema.of( + Field.newBuilder("TimestampField", StandardSQLTypeName.TIMESTAMP) + .setMode(Field.Mode.NULLABLE) + .setDescription("TimestampDescription") + .build(), + Field.newBuilder("StringField", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .setDescription("StringDescription") + .build(), + Field.newBuilder("IntegerArrayField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.REPEATED) + .setDescription("IntegerArrayDescription") + .build(), + Field.newBuilder("BooleanField", StandardSQLTypeName.BOOL) + .setMode(Field.Mode.NULLABLE) + .setDescription("BooleanDescription") + .build(), + Field.newBuilder("BytesField", StandardSQLTypeName.BYTES) + .setMode(Field.Mode.NULLABLE) + .setDescription("BytesDescription") + .build(), + Field.newBuilder( + "RecordField", + StandardSQLTypeName.STRUCT, + Field.newBuilder("StringField", StandardSQLTypeName.STRING) + .setMode(Field.Mode.NULLABLE) + .setDescription("StringDescription") + .build(), + Field.newBuilder("BooleanField", StandardSQLTypeName.BOOL) + .setMode(Field.Mode.NULLABLE) + .setDescription("BooleanDescription") + .build()) + .setMode(Field.Mode.NULLABLE) + .setDescription("RecordDescription") + .build(), + Field.newBuilder("IntegerField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("IntegerDescription") + .build(), + Field.newBuilder("GeographyField", StandardSQLTypeName.GEOGRAPHY) + .setMode(Field.Mode.NULLABLE) + .setDescription("GeographyDescription") + .build(), + Field.newBuilder("NumericField", StandardSQLTypeName.NUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("NumericDescription") + .build(), + Field.newBuilder("BigNumericField", StandardSQLTypeName.BIGNUMERIC) + .setMode(Field.Mode.NULLABLE) + .setDescription("BigNumericDescription") + .build(), + Field.newBuilder("TimeField", StandardSQLTypeName.TIME) + .setMode(Field.Mode.NULLABLE) + .setDescription("TimeDescription") + .build(), + Field.newBuilder("DateField", StandardSQLTypeName.DATE) + .setMode(Field.Mode.NULLABLE) + .setDescription("DateDescription") + .build(), + Field.newBuilder("JSONField", StandardSQLTypeName.JSON) + .setMode(Field.Mode.NULLABLE) + .setDescription("JSONFieldDescription") + .build(), + Field.newBuilder("IntervalField", StandardSQLTypeName.INTERVAL) + .setMode(Field.Mode.NULLABLE) + .setDescription("IntervalFieldDescription") + .build()); + + @Rule public Timeout globalTimeout = Timeout.seconds(1800); // setting 30 mins as the timeout + + @BeforeClass + public static void beforeClass() throws InterruptedException, IOException { + RemoteBigQueryHelper bigqueryHelper = RemoteBigQueryHelper.create(); + bigquery = bigqueryHelper.getOptions().getService(); + createDataset(DATASET); + createTable(DATASET, TABLE, BQ_SCHEMA); + populateTestRecords(DATASET, TABLE); 
+ } + + @AfterClass + public static void afterClass() throws ExecutionException, InterruptedException { + try { + if (bigquery != null) { + deleteTable(DATASET, TABLE); + RemoteBigQueryHelper.forceDelete(bigquery, DATASET); + } else { + fail("Error clearing the test dataset"); + } + } catch (BigQueryException e) { + fail("Error clearing the test dataset " + e); + } + } + + @Test + public void testInvalidQuery() throws BigQuerySQLException { + Connection connection = getConnection(); + try { + BigQueryResult bigQueryResult = connection.executeSelect(INVALID_QUERY); + fail("BigQuerySQLException was expected"); + } catch (BigQuerySQLException ex) { + assertNotNull(ex.getMessage()); + assertTrue(ex.getMessage().toLowerCase().contains("unexpected keyword into")); + } finally { + connection.close(); + } + } + + /* + This tests the order of the records as well as the value of each record, using testForAllDataTypeValues + */ + @Test + public void testIterateAndOrder() throws SQLException { + Connection connection = getConnection(); + BigQueryResult bigQueryResult = connection.executeSelect(QUERY); + logger.log(Level.INFO, "Query used: {0}", QUERY); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + + int prevIntegerFieldVal = 0; + while (rs.next()) { + if (cnt == 0) { // first row is supposed to be null + assertNull(rs.getString("StringField")); + assertNull(rs.getString("GeographyField")); + Object intAryField = rs.getObject("IntegerArrayField"); + if (intAryField instanceof JsonStringArrayList) { + assertEquals( + new JsonStringArrayList(), + ((JsonStringArrayList) intAryField)); // null array is returned as an empty array + } + assertFalse(rs.getBoolean("BooleanField")); + assertTrue(0.0d == rs.getDouble("BigNumericField")); + assertTrue(0 == rs.getInt("IntegerField")); + assertTrue(0L == rs.getLong("NumericField")); + assertNull(rs.getBytes("BytesField")); + assertNull(rs.getTimestamp("TimestampField")); + assertNull(rs.getTime("TimeField")); + assertNull(rs.getDate("DateField")); + assertNull(rs.getString("JSONField")); + assertFalse(rs.getBoolean("BooleanField_1")); + assertNull(rs.getString("StringField_1")); + assertNull(rs.getString("hello")); // equivalent of testJsonType + assertEquals(0, rs.getInt("id")); + + } else { // remaining rows are supposed to be non-null + assertNotNull(rs.getString("StringField")); + assertNotNull(rs.getString("GeographyField")); + assertNotNull(rs.getObject("IntegerArrayField")); + assertTrue(rs.getBoolean("BooleanField")); + assertTrue(0.0d < rs.getDouble("BigNumericField")); + assertTrue(0 < rs.getInt("IntegerField")); + assertTrue(0L < rs.getLong("NumericField")); + assertNotNull(rs.getBytes("BytesField")); + assertNotNull(rs.getTimestamp("TimestampField")); + assertNotNull(rs.getTime("TimeField")); + assertNotNull(rs.getDate("DateField")); + assertNotNull(rs.getString("JSONField")); + assertFalse(rs.getBoolean("BooleanField_1")); + assertNotNull(rs.getString("StringField_1")); + + // check the order of the records + assertTrue(prevIntegerFieldVal < rs.getInt("IntegerField")); + prevIntegerFieldVal = rs.getInt("IntegerField"); + + testForAllDataTypeValues(rs, cnt); // asserts the value of each row + } + ++cnt; + } + assertEquals(LIMIT_RECS, cnt); // all the records were retrieved + connection.close(); + } + + /* + This tests the order and values of the records when using default connection settings, again via testForAllDataTypeValues + */ + @Test + public void testIterateAndOrderDefaultConnSettings() throws SQLException 
{ + Connection connection = bigquery.createConnection(); + BigQueryResult bigQueryResult = connection.executeSelect(QUERY); + logger.log(Level.INFO, "Query used: {0}", QUERY); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + + int prevIntegerFieldVal = 0; + while (rs.next()) { + if (cnt == 0) { // first row is supposed to be null + assertNull(rs.getString("StringField")); + assertNull(rs.getString("GeographyField")); + Object intAryField = rs.getObject("IntegerArrayField"); + if (intAryField instanceof JsonStringArrayList) { + assertEquals( + new JsonStringArrayList(), + ((JsonStringArrayList) intAryField)); // null array is returned as an empty array + } + assertFalse(rs.getBoolean("BooleanField")); + assertTrue(0.0d == rs.getDouble("BigNumericField")); + assertTrue(0 == rs.getInt("IntegerField")); + assertTrue(0L == rs.getLong("NumericField")); + assertNull(rs.getBytes("BytesField")); + assertNull(rs.getTimestamp("TimestampField")); + assertNull(rs.getTime("TimeField")); + assertNull(rs.getDate("DateField")); + assertNull(rs.getString("JSONField")); + assertFalse(rs.getBoolean("BooleanField_1")); + assertNull(rs.getString("StringField_1")); + assertNull(rs.getString("hello")); // equivalent of testJsonType + assertEquals(0, rs.getInt("id")); + + } else { // remaining rows are supposed to be non-null + assertNotNull(rs.getString("StringField")); + assertNotNull(rs.getString("GeographyField")); + assertNotNull(rs.getObject("IntegerArrayField")); + assertTrue(rs.getBoolean("BooleanField")); + assertTrue(0.0d < rs.getDouble("BigNumericField")); + assertTrue(0 < rs.getInt("IntegerField")); + assertTrue(0L < rs.getLong("NumericField")); + assertNotNull(rs.getBytes("BytesField")); + assertNotNull(rs.getTimestamp("TimestampField")); + assertNotNull(rs.getTime("TimeField")); + assertNotNull(rs.getDate("DateField")); + assertNotNull(rs.getString("JSONField")); + assertFalse(rs.getBoolean("BooleanField_1")); + assertNotNull(rs.getString("StringField_1")); + + // check the order of the records + assertTrue(prevIntegerFieldVal < rs.getInt("IntegerField")); + prevIntegerFieldVal = rs.getInt("IntegerField"); + + testForAllDataTypeValues(rs, cnt); // asserts the value of each row + } + ++cnt; + } + assertEquals(LIMIT_RECS, cnt); // all the records were retrieved + connection.close(); + } + + @Test + public void testMultipleRuns() throws SQLException { + + Connection connection = getConnection(); + BigQueryResult bigQueryResult = connection.executeSelect(MULTI_QUERY); + logger.log(Level.INFO, "Query used: {0}", MULTI_QUERY); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + int totalCnt = 0; + + int prevIntegerFieldVal = 0; + while (rs.next()) { + if (cnt == 0) { // first row is supposed to be null + assertNull(rs.getString("StringField")); + assertNull(rs.getString("GeographyField")); + Object intAryField = rs.getObject("IntegerArrayField"); + if (intAryField instanceof JsonStringArrayList) { + assertEquals( + new JsonStringArrayList(), + ((JsonStringArrayList) intAryField)); // null array is returned as an empty array + } + assertFalse(rs.getBoolean("BooleanField")); + assertTrue(0.0d == rs.getDouble("BigNumericField")); + assertTrue(0 == rs.getInt("IntegerField")); + assertTrue(0L == rs.getLong("NumericField")); + assertNull(rs.getBytes("BytesField")); + assertNull(rs.getTimestamp("TimestampField")); + assertNull(rs.getTime("TimeField")); + assertNull(rs.getDate("DateField")); + assertNull(rs.getString("JSONField")); + assertFalse(rs.getBoolean("BooleanField_1")); + 
assertNull(rs.getString("StringField_1")); + assertNull(rs.getString("hello")); // equivalent of testJsonType + assertEquals(0, rs.getInt("id")); + + } else { // remaining rows are supposed to be non-null + // check the order of the records + assertTrue(prevIntegerFieldVal < rs.getInt("IntegerField")); + prevIntegerFieldVal = rs.getInt("IntegerField"); + + testForAllDataTypeValues(rs, cnt); // asserts the value of each row + } + ++cnt; + } + connection.close(); + totalCnt += cnt; + // Repeat the same run + connection = getConnection(); + bigQueryResult = connection.executeSelect(MULTI_QUERY); + rs = bigQueryResult.getResultSet(); + cnt = 0; + prevIntegerFieldVal = 0; + while (rs.next()) { + if (cnt == 0) { // first row is supposed to be null + assertNull(rs.getString("StringField")); + assertNull(rs.getString("GeographyField")); + Object intAryField = rs.getObject("IntegerArrayField"); + if (intAryField instanceof JsonStringArrayList) { + assertEquals( + new JsonStringArrayList(), + ((JsonStringArrayList) intAryField)); // null array is returned as an empty array + } + assertFalse(rs.getBoolean("BooleanField")); + assertTrue(0.0d == rs.getDouble("BigNumericField")); + assertTrue(0 == rs.getInt("IntegerField")); + assertTrue(0L == rs.getLong("NumericField")); + assertNull(rs.getBytes("BytesField")); + assertNull(rs.getTimestamp("TimestampField")); + assertNull(rs.getTime("TimeField")); + assertNull(rs.getDate("DateField")); + assertNull(rs.getString("JSONField")); + assertFalse(rs.getBoolean("BooleanField_1")); + assertNull(rs.getString("StringField_1")); + assertNull(rs.getString("hello")); // equivalent of testJsonType + assertEquals(0, rs.getInt("id")); + + } else { // remaining rows are supposed to be non-null + // check the order of the records + assertTrue(prevIntegerFieldVal < rs.getInt("IntegerField")); + prevIntegerFieldVal = rs.getInt("IntegerField"); + + testForAllDataTypeValues(rs, cnt); // asserts the value of each row + } + ++cnt; + } + connection.close(); + totalCnt += cnt; + assertEquals(MULTI_LIMIT_RECS * 2, totalCnt); + } + + @Test + public void testPositionalParams() + throws SQLException { // Bypasses the Read API, as it doesn't support positional parameters + Connection connection = getConnection(); + Parameter dateParam = + Parameter.newBuilder().setValue(QueryParameterValue.date("2022-01-01")).build(); + Parameter boolParam = Parameter.newBuilder().setValue(QueryParameterValue.bool(true)).build(); + Parameter intParam = Parameter.newBuilder().setValue(QueryParameterValue.int64(1)).build(); + Parameter numericParam = + Parameter.newBuilder().setValue(QueryParameterValue.numeric(new BigDecimal(100))).build(); + List<Parameter> parameters = ImmutableList.of(dateParam, boolParam, intParam, numericParam); + + BigQueryResult bigQueryResult = connection.executeSelect(POSITIONAL_QUERY, parameters); + logger.log(Level.INFO, "Query used: {0}", POSITIONAL_QUERY); + ResultSet rs = bigQueryResult.getResultSet(); + int cnt = 0; + while (rs.next()) { + assertFalse(rs.getBoolean("BooleanField")); + assertTrue(0.0d <= rs.getDouble("BigNumericField")); + assertTrue(0 <= rs.getInt("IntegerField")); + assertTrue(0L <= rs.getLong("NumericField")); + assertNotNull(rs.getBytes("BytesField")); + assertNotNull(rs.getTimestamp("TimestampField")); + assertNotNull(rs.getTime("TimeField")); + assertNotNull(rs.getDate("DateField")); + assertNotNull(rs.getString("JSONField")); + assertTrue(rs.getBoolean("BooleanField_1")); + assertNotNull(rs.getString("StringField_1")); + ++cnt; + } + connection.close(); + assertEquals(MULTI_LIMIT_RECS, cnt); + }
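+ + // Fixture note: for the N-th inserted row (1-based insert order), getNextRow() sets + // StringField = "String Val N", IntegerField = 1 + N, NumericField = 100 + N and + // BigNumericField = 10000000 + N, which is exactly what testForAllDataTypeValues below + // asserts for the row at cursor position cnt == N.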
+ + // asserts the value of each row + private static void testForAllDataTypeValues(ResultSet rs, int cnt) throws SQLException { + // Testing JSON type + assertEquals("\"world\"", rs.getString("hello")); // BQ stores the value as "world" + assertEquals(100, rs.getInt("id")); + assertEquals("{\"hello\":\"world\",\"id\":100}", rs.getString("JSONField")); + + // String and Geography types + assertEquals(String.format("String Val %s", cnt), rs.getString("StringField")); + assertEquals("POINT(1 2)", rs.getString("GeographyField")); + + // Array type tests + if (rs.getObject("IntegerArrayField") instanceof JsonStringArrayList) { + JsonStringArrayList ary = (JsonStringArrayList) rs.getObject("IntegerArrayField"); + assertEquals(3, ary.size()); + assertEquals(1, ary.get(0).intValue()); + assertEquals(2, ary.get(1).intValue()); + assertEquals(3, ary.get(2).intValue()); + } + + // BigNumeric, int and Numeric + assertTrue(10000000L + cnt == rs.getDouble("BigNumericField")); + assertEquals(1 + cnt, rs.getInt("IntegerField")); + assertEquals(100 + cnt, rs.getLong("NumericField")); + // Test Byte field + assertEquals("TestByteValue", new String(rs.getBytes("BytesField"), StandardCharsets.UTF_8)); + + // Struct Fields + assertFalse(rs.getBoolean("BooleanField_1")); + assertEquals(String.format("Str Val %s", cnt), rs.getString("StringField_1")); + + // Timestamp, Time, DateTime and Date fields + assertEquals(1649064795000L, rs.getTimestamp("TimestampField").getTime()); + assertEquals( + java.sql.Date.valueOf("2022-01-01").toString(), rs.getDate("DateField").toString()); + // TIME values are independent of a specific date and time zone. For example, 12:11:35 (GMT) + // is returned as 17:41:35 (GMT+5:30), so the local zone offset has to be added. + int offset = + TimeZone.getTimeZone(ZoneId.systemDefault()) + .getOffset(new java.util.Date().getTime()); // offset in milliseconds + assertEquals( + Time.valueOf(LocalTime.of(12, 11, 35)).getTime() + offset, + rs.getTime("TimeField").getTime()); + } + + private static void populateTestRecords(String datasetName, String tableName) { + TableId tableId = TableId.of(datasetName, tableName); + for (int batchCnt = 1; batchCnt <= NUM_BATCHES; batchCnt++) { + addBatchRecords(tableId); + } + } + + private static void addBatchRecords(TableId tableId) { + Map<String, Object> nullRow = new HashMap<>(); + try { + InsertAllRequest.Builder reqBuilder = InsertAllRequest.newBuilder(tableId); + if (rowCnt == 0) { + reqBuilder.addRow(nullRow); + } + for (int i = 0; i < REC_PER_BATCHES; i++) { + reqBuilder.addRow(getNextRow()); + } + InsertAllResponse response = bigquery.insertAll(reqBuilder.build()); + + if (response.hasErrors()) { + // If any of the insertions failed, this lets you inspect the errors + for (Map.Entry<Long, List<BigQueryError>> entry : response.getInsertErrors().entrySet()) { + logger.log(Level.WARNING, "Exception while adding records {0}", entry.getValue()); + } + fail("Response has errors"); + } + } catch (BigQueryException e) { + logger.log(Level.WARNING, "Exception while adding records {0}", e); + fail("Error in addBatchRecords"); + } + } + + private static void createTable(String datasetName, String tableName, Schema schema) { + try { + TableId tableId = TableId.of(datasetName, tableName); + TableDefinition tableDefinition = StandardTableDefinition.of(schema); + TableInfo tableInfo = TableInfo.newBuilder(tableId, tableDefinition).build(); + Table table = bigquery.create(tableInfo); + assertTrue(table.exists()); + } catch (BigQueryException e) { + fail("Table was not created. \n" + e); + } + }
+ + public static void deleteTable(String datasetName, String tableName) { + try { + assertTrue(bigquery.delete(TableId.of(datasetName, tableName))); + } catch (BigQueryException e) { + fail("Table was not deleted. \n" + e); + } + } + + public static void createDataset(String datasetName) { + try { + DatasetInfo datasetInfo = DatasetInfo.newBuilder(datasetName).build(); + Dataset newDataset = bigquery.create(datasetInfo); + assertNotNull(newDataset.getDatasetId().getDataset()); + } catch (BigQueryException e) { + fail("Dataset was not created. \n" + e); + } + } + + public static void deleteDataset(String datasetName) { + try { + DatasetInfo datasetInfo = DatasetInfo.newBuilder(datasetName).build(); + assertTrue(bigquery.delete(datasetInfo.getDatasetId())); + } catch (BigQueryException e) { + fail("Dataset was not deleted. \n" + e); + } + } + + private Connection getConnection() { + + ConnectionSettings connectionSettings = + ConnectionSettings.newBuilder() + .setDefaultDataset(DatasetId.of(DATASET)) + .build(); // Read API is enabled by default + return bigquery.createConnection(connectionSettings); + } + + private static Map<String, Object> getNextRow() { + rowCnt++; + Map<String, Object> row = new HashMap<>(); + Map<String, Object> structVal = new HashMap<>(); + structVal.put("StringField", "Str Val " + rowCnt); + structVal.put("BooleanField", false); + row.put("RecordField", structVal); // struct + row.put("TimestampField", "2022-04-04 15:03:15.000 +05:30"); + row.put("StringField", "String Val " + rowCnt); + row.put("IntegerArrayField", new int[] {1, 2, 3}); + row.put("BooleanField", true); + row.put("BytesField", BYTES_BASE64); + row.put("IntegerField", 1 + rowCnt); + row.put("GeographyField", "POINT(1 2)"); + row.put("NumericField", 100 + rowCnt); + row.put("BigNumericField", 10000000L + rowCnt); + row.put("TimeField", "12:11:35"); + row.put("DateField", "2022-01-01"); + row.put("JSONField", "{\"hello\":\"world\",\"id\":100}"); + row.put("IntervalField", "10000-0 3660000 87840000:0:0"); + return row; + } +} diff --git a/pom.xml b/pom.xml index ad8867220..5283f06a9 100644 --- a/pom.xml +++ b/pom.xml @@ -66,6 +66,21 @@ <type>pom</type> <scope>import</scope> + + <dependency> + <groupId>com.google.cloud</groupId> + <artifactId>google-cloud-datacatalog-bom</artifactId> + <version>1.6.1</version> + <type>pom</type> + <scope>import</scope> + </dependency> <dependency> <groupId>com.google.cloud</groupId> @@ -107,6 +122,13 @@ <version>1.7.0</version> + + <dependency> + <groupId>org.threeten</groupId> + <artifactId>threeten-extra</artifactId> + <version>1.7.0</version> + </dependency> <dependency> <groupId>junit</groupId> @@ -148,10 +170,59 @@ + <dependency> + <groupId>com.google.protobuf</groupId> + <artifactId>protobuf-java</artifactId> + </dependency> + <dependency> + <groupId>com.google.cloud</groupId> + <artifactId>google-cloud-bigquerystorage</artifactId> + <version>2.8.3</version> + </dependency> + <dependency> + <groupId>com.google.api.grpc</groupId> + <artifactId>proto-google-cloud-bigquerystorage-v1</artifactId> + <version>2.8.3</version> + </dependency> + <dependency> + <groupId>org.apache.arrow</groupId> + <artifactId>arrow-vector</artifactId> + <version>7.0.0</version> + </dependency> + <dependency> + <groupId>org.apache.arrow</groupId> + <artifactId>arrow-memory-core</artifactId> + <version>7.0.0</version> + </dependency> + <dependency> + <groupId>org.apache.arrow</groupId> + <artifactId>arrow-memory-netty</artifactId> + <version>7.0.0</version> + <scope>runtime</scope> + </dependency> <artifactId>google-cloud-bigquery</artifactId> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-dependency-plugin</artifactId> + <configuration> + <ignoredUnusedDeclaredDependencies> + <ignoredUnusedDeclaredDependency>org.apache.arrow:arrow-memory-netty</ignoredUnusedDeclaredDependency> + </ignoredUnusedDeclaredDependencies> + </configuration> + </plugin>
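
For reference, the snippet below is a minimal sketch of how the Connection surface exercised by these tests is used end to end. It is illustrative only, not part of this diff: the dataset and table names (my_dataset, my_table) and the column/parameter names are placeholders, while the calls themselves (BigQuery.createConnection, Connection.executeSelect, BigQueryResult.getResultSet) are exactly the ones the tests above rely on.

```
// Minimal usage sketch (not part of this change). Assumes application-default
// credentials and an existing dataset "my_dataset" with a table "my_table"
// that has a STRING column StringField -- all of these names are placeholders.
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.BigQueryResult;
import com.google.cloud.bigquery.Connection;
import com.google.cloud.bigquery.ConnectionSettings;
import com.google.cloud.bigquery.DatasetId;
import com.google.cloud.bigquery.Parameter;
import com.google.cloud.bigquery.QueryParameterValue;
import com.google.common.collect.ImmutableList;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

public class ConnectionApiSketch {
  public static void main(String[] args) throws SQLException {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // As in getConnection() above: the Read API is enabled by default, and the
    // default dataset lets the query reference unqualified table names.
    ConnectionSettings settings =
        ConnectionSettings.newBuilder().setDefaultDataset(DatasetId.of("my_dataset")).build();
    Connection connection = bigquery.createConnection(settings);

    // Named parameter: setName() ties the Parameter to the @stringParam placeholder.
    Parameter stringParam =
        Parameter.newBuilder()
            .setName("stringParam")
            .setValue(QueryParameterValue.string("stringValue"))
            .build();
    List<Parameter> parameters = ImmutableList.of(stringParam);

    BigQueryResult result =
        connection.executeSelect(
            "SELECT StringField FROM my_table WHERE StringField = @stringParam", parameters);

    // Results are consumed through a standard java.sql.ResultSet view.
    ResultSet rs = result.getResultSet();
    while (rs.next()) {
      System.out.println(rs.getString("StringField"));
    }
    connection.close();
  }
}
```

Positional parameters follow the same shape but use '?' placeholders in the SQL and Parameter objects without setName(), binding in list order; internally, executeSelect chooses between the standard query path and the BigQuery Storage Read API based on the ConnectionSettings (e.g. setUseReadAPI) and the configured result-size thresholds.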