Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KeyspaceMetadata causes panic when describing aggregates #1587

Open
evildecay opened this issue Oct 22, 2021 · 8 comments · May be fixed by #1590
Open

KeyspaceMetadata causes panic when describing aggregates #1587

evildecay opened this issue Oct 22, 2021 · 8 comments · May be fixed by #1590
Labels

Comments

@evildecay
Copy link
Contributor

Please answer these questions before submitting your issue. Thanks!

What version of Cassandra are you using?

Cassandra 3.11.11

What version of Gocql are you using?

github.com/gocql/gocql@latest

What version of Go are you using?

go version go1.15.14 windows/amd64

What did you do?

func getClusterAndSession(option Option) (*gocql.ClusterConfig, *gocql.Session, error) {
	...

	cluster := gocql.NewCluster(cass...)
	cluster.Authenticator = gocql.PasswordAuthenticator{
		Username: user,
		Password: crypto.AesDecrypt(pass, crypto.EncryptKey),
	}
	cluster.NumConns = numConns
	cluster.SocketKeepalive = time.Millisecond * 500
	cluster.Keyspace = strings.ToLower(option.Keyspace)
	cluster.Timeout = timeoutSecond * time.Second
	cluster.ProtoVersion = protoVersion
	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy()) // This line causes panic
	cluster.RetryPolicy = &gocql.SimpleRetryPolicy{NumRetries: retryNum}
	cluster.DisableInitialHostLookup = option.DisableInitialHostLookup // If true, no token aware. Do not use in a production environment
	// "ANY", "ONE", "TWO", "THREE", "QUORUM", "ALL", "LOCAL_QUORUM", "EACH_QUORUM", "LOCAL_ONE"
	if consistency, err := env.GetVars("CONSISTENCY"); err == nil {
		cluster.Consistency = gocql.ParseConsistency(strings.ToUpper(consistency[0]))
	} else {
		cluster.Consistency = gocql.LocalQuorum
	}

	if session, err := cluster.CreateSession(); err != nil {
		return nil, nil, err
	} else {
		return cluster, session, nil
	}
}

What did you expect to see?

I hope it can create the connection normally

What did you see instead?

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xb4e466]

goroutine 1 [running]:
github.com/gocql/gocql.compileMetadata(0x4, 0xc000550660, 0xc000780000, 0x17e, 0x249, 0xc000bd2000, 0x1934, 0x1d99, 0xc00015b200, 0xe, ...)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/metadata.go:330 +0x2e6
github.com/gocql/gocql.(*schemaDescriber).refreshSchema(0xc00050a860, 0xe8c89b, 0x6, 0x6, 0x153e8e0)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/metadata.go:293 +0x40d
github.com/gocql/gocql.(*schemaDescriber).getSchema(0xc00050a860, 0xe8c89b, 0x6, 0x0, 0x0, 0x0)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/metadata.go:237 +0x109
github.com/gocql/gocql.(*Session).KeyspaceMetadata(0xc000218000, 0xe8c89b, 0x6, 0xc000291980, 0x0, 0x10)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/session.go:526 +0x91
github.com/gocql/gocql.(*tokenAwareHostPolicy).updateReplicas(0xc00055a100, 0xc000291980, 0xe8c89b, 0x6)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/policies.go:414 +0x9e
github.com/gocql/gocql.(*tokenAwareHostPolicy).KeyspaceChanged(0xc00055a100, 0xe8c89b, 0x6, 0x0, 0x0)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/policies.go:404 +0x99
github.com/gocql/gocql.(*Session).init(0xc000218000, 0xc000426000, 0x0)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/session.go:322 +0x775
github.com/gocql/gocql.NewSession(0xc0002903c0, 0x1, 0x1, 0xe8b00a, 0x5, 0x4, 0x2540be400, 0x23c34600, 0x2352, 0xe8c89b, ...)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/session.go:173 +0x8a9
github.com/gocql/gocql.(*ClusterConfig).CreateSession(...)
        D:/Develop/gocode/pkg/mod/github.com/gocql/gocql@v0.0.0-20211015133455-b225f9b53fa1/cluster.go:205
git.wecise.com/wecise/common/cassandra.getClusterAndSession(0xe8c89b, 0x6, 0xc000209e00, 0xc000744000, 0x0, 0x0, 0xc000209d70)
        D:/Develop/projects/wecise/common/cassandra/cassandra.go:137 +0x678
git.wecise.com/wecise/common/cassandra.GetClientByOption(0xe8c89b, 0x6, 0xc000744000, 0x3, 0x0, 0x0)
        D:/Develop/projects/wecise/common/cassandra/cassandra.go:156 +0x45
git.wecise.com/wecise/common/multitenant.Init(0xf757c0, 0xc00000e018)
        D:/Develop/projects/wecise/common/multitenant/multitenant.go:308 +0x207
main.main()
        D:/Develop/projects/wecise/common/testrun.go:18 +0x9c

If you are having connectivity related issues please share the following additional information

Describe your Cassandra cluster

please provide the following information

  • output of nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns (effective)  Host ID                               Rack
UN  172.26.38.247  11.25 GiB  16           100.0%            c1883d65-9fe1-4fd2-90bf-f92839183a2d  rack1
  • output of SELECT peer, rpc_address FROM system.peers
cassandra@cqlsh> SELECT peer, rpc_address FROM system.peers;

 peer | rpc_address
------+-------------

(0 rows)
  • rebuild your application with the gocql_debug tag and post the output
@evildecay evildecay changed the title oken aware host policy causes panic Token aware host policy causes panic Oct 22, 2021
@martin-sucha
Copy link
Contributor

Hello @evildecay, sorry for the late reply.

Based on the stack trace above it seems that there is a final_function function referenced in user defined aggregates as returned by:

SELECT
			aggregate_name,
			argument_types,
			final_func,
			initcond,
			return_type,
			state_func,
			state_type
		FROM system_schema.aggregates
		WHERE keyspace_name = ?

but that function is not present in the list of functions as returned by:

SELECT
			function_name,
			argument_types,
			argument_names,
			body,
			called_on_null_input,
			language,
			return_type
		FROM system_schema.functions
		WHERE keyspace_name = ?

Could you please try to check the results of these statements on your instance if you still can reproduce and verify that it's the case?

Of course the driver should not panic in this case, so we should handle this in some way. It is not clear to me at this point what the correct handling is though. I don't have experience with aggregates in Cassandra. Is it possible this state regularly happens or is it some inconsistency on server side? In other words, should we just ignore the aggregate or do we want to return an error from KeyspaceMetadata instead?

@evildecay
Copy link
Contributor Author

Hi @martin-sucha

Query result:

cassandra@cqlsh> SELECT aggregate_name, argument_types, final_func, initcond, return_type, state_func, state_type FROM system_schema.aggregates WHERE keyspace_name = 'matrix';

 aggregate_name | argument_types   | final_func | initcond | return_type     | state_func | state_type
----------------+------------------+------------+----------+-----------------+------------+-----------------
           enum |          ['int'] |       null |       {} |   map<int, int> | enum_count |   map<int, int>
           enum |         ['text'] |       null |       {} |  map<text, int> | enum_count |  map<text, int>
           enum | ['text', 'text'] |       null |       {} | map<text, text> | enum_count | map<text, text>
           enum |    ['timestamp'] |       null |       {} |  map<text, int> | enum_count |  map<text, int>

(4 rows)
cassandra@cqlsh> SELECT function_name, argument_types, argument_names, body, called_on_null_input, language, return_type FROM system_schema.functions WHERE keyspace_name = 'matrix';

 function_name | argument_types                      | argument_names             | body                                                                                                                                                                                                                                                                                     | called_on_null_input | language | return_type
---------------+-------------------------------------+----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+----------+-----------------
    enum_count |            ['map<int, int>', 'int'] |          ['state', 'type'] |                                                                                                                                           if(type!=null){ Integer count = (Integer) state.get(type); if (count == null) count = 1; else count++; state.put(type, count);} return state;  |                 True |     java |   map<int, int>
    enum_count |          ['map<text, int>', 'text'] |          ['state', 'type'] |                                                                                                                                            if(type!=null){Integer count = (Integer) state.get(type); if (count == null) count = 1; else count++; state.put(type, count);} return state;  |                 True |     java |  map<text, int>
    enum_count |     ['map<text, int>', 'timestamp'] |          ['state', 'type'] |                       if(type!=null){  java.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat("yyyy-MM-dd"); String mydate=sdf.format(type); Integer count = (Integer) state.get(mydate); if (count == null) count = 1; else count++; state.put(mydate, count);} return state;  |                 True |     java |  map<text, int>
    enum_count | ['map<text, text>', 'text', 'text'] | ['state', 'group', 'type'] |                                     if(group!=null){String tenum = (String) state.get(group); if (tenum == null) tenum = type; else if (!(tenum.contains(","+type) || tenum.startsWith(type+",") || tenum.equals(type)))  tenum=tenum+","+type; state.put(group, tenum);} return state;  |                 True |     java | map<text, text>
        format |               ['timestamp', 'text'] |       ['input', 'pattern'] |                                                                                               \n\t\tjava.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat(pattern);\n\t\ttry{\n\t\t\treturn sdf.format(input); \n\t\t}catch(Exception e){\n\t\t\treturn null;\n\t\t}      \n\t |                False |     java |            text
         lower |                            ['text'] |                    ['str'] |                                                                                                                                                                                                                                                              return str.toLowerCase() ;  |                False |     java |            text
         ltrim |                            ['text'] |                    ['str'] |                                                                                                                                                                                                                                                    return str.replaceAll("^\\s*", "") ;  |                False |     java |            text
       referid |    ['map<text, frozen<set<text>>>'] |                      ['x'] |                                                                                                              \n Set<String> vals = x.get("_all");\n if (vals==null) {\nreturn null ;\n }else if (vals.size() > 0){\nreturn ((String[])vals.toArray())[0] ;\n }else{\nreturn null ;\n }\n |                False |     java |            text
      referids |    ['map<text, frozen<set<text>>>'] |                      ['x'] |                                                                                                                                                 \n Set<String> vals = x.get("_all");\n if (vals==null) {\nreturn null ;\n }else{\nreturn Arrays.asList((String[])vals.toArray()) ;\n }\n |                False |     java |      list<text>
         rtrim |                            ['text'] |                    ['str'] |                                                                                                                                                                                                                                                     return str.replaceAll("\\s*$", "") ; |                False |     java |            text
     substring |              ['text', 'int', 'int'] |       ['str', 'st', 'len'] |                                                                                                                                                                                                                                                      return str.substring(st, st+len) ;  |                False |     java |            text
     timestamp |                    ['text', 'text'] |       ['input', 'pattern'] |                                                                                                \n\t\tjava.text.SimpleDateFormat sdf = new java.text.SimpleDateFormat(pattern);\n\t\ttry{\n\t\t\treturn sdf.parse(input); \n\t\t}catch(Exception e){\n\t\t\treturn null;\n\t\t}      \n\t |                False |     java |       timestamp
   tojobstatus |                             ['int'] |                      ['x'] | \n   switch(x) { case 1: return "Ready";\tcase 2: return "Waiting"; case 3: return "Running";case 4: return "Succeeded";case 5: return "Failed";case 6: return "Cancelled";\tcase 7: return "Skipped";case 8: return "Timeout";case 0:\treturn "Unknown";default:  return "Unknown"; }   |                False |     java |            text
          trim |                            ['text'] |                    ['str'] |                                                                                                                                                                                                                                                                     return str.trim() ;  |                False |     java |            text
        uint64 |                          ['bigint'] |                      ['x'] |                                                                                                                                                                                                                                                        return Long.toUnsignedString(x);  |                False |     java |            text
         upper |                            ['text'] |                    ['str'] |                                                                                                                                                                                                                                                              return str.toUpperCase() ;  |                False |     java |            text

(16 rows)

@martin-sucha martin-sucha changed the title Token aware host policy causes panic KeyspaceMetadata causes panic when describing aggregates Dec 22, 2021
@martin-sucha
Copy link
Contributor

Ah, there is null in final_func. And I see now that final function is optional in user-defined aggregated (docs). Unfortunately the KeyspaceMetadata types don't account for this case.

martin-sucha added a commit to martin-sucha/gocql that referenced this issue Dec 22, 2021
There were the following issues with handling of aggregates:

- compileMetadata panicked if final_func was null.
  final_func is optional[1], so null is perfectly valid value.
- The AggregateMetadata FinalFunc field could not store nil,
  so it is not possible to represent the null.
- Functions and aggregates are not uniquely identified just by name.
  There could be two functions with the same name, but different
  argument types.

I'm adding a new version of function and aggregate metadata to fix
those shortcomings. For backward compatibility, I leave the V1
metadata filled-in where possible.

Fixes gocql#1587

[1] https://cassandra.apache.org/doc/latest/cassandra/cql/functions.html#user-defined-aggregates-functions
@martin-sucha martin-sucha linked a pull request Dec 22, 2021 that will close this issue
@martin-sucha
Copy link
Contributor

Hello @evildecay! I think I have a fix. Could you please verify that #1590 fixes the issue for you?

martin-sucha added a commit to martin-sucha/gocql that referenced this issue Apr 13, 2022
There were the following issues with handling of aggregates:

- compileMetadata panicked if final_func was null.
  final_func is optional[1], so null is perfectly valid value.
- The AggregateMetadata FinalFunc field could not store nil,
  so it is not possible to represent the null.
- Functions and aggregates are not uniquely identified just by name.
  There could be two functions with the same name, but different
  argument types.

I'm adding a new version of function and aggregate metadata to fix
those shortcomings. For backward compatibility, I leave the V1
metadata filled-in where possible.

Fixes gocql#1587

[1] https://cassandra.apache.org/doc/latest/cassandra/cql/functions.html#user-defined-aggregates-functions
@evildecay
Copy link
Contributor Author

@martin-sucha Tested the gocql 1.2.0 version, it should have fixed this problem. awesome! This issue can be closed

@martin-sucha
Copy link
Contributor

@evildecay gocql 1.2.0 does not include the changes from #1590 though, so I'm not sure if we can close this.

Is it possible that the data in your SELECT aggregate_name, argument_types, final_func, initcond, return_type, state_func, state_type FROM system_schema.aggregates WHERE keyspace_name = 'matrix'; has changed and the final_func is non-null now?

@OleksiienkoMykyta
Copy link

I will try to reproduce this panic with given information, to make sure if this issue could be closed

@OleksiienkoMykyta
Copy link

OleksiienkoMykyta commented May 22, 2024

I have tested this case, but it works properly, so the final_func could be null. I think this issue should be closed.
What version of Cassandra are you using?
Cassandra 3.11.11

ProtoVersion :4

What version of Gocql are you using?
github.com/gocql/gocql@1.2.0
also tested on 1.6.0

What version of Go are you using?
go version go1.15.14
also tested on 1.22.1

package gocql_test

import (
	"fmt"
	"log"
	"testing"
	"time"
	
	"github.com/gocql/gocql"
)

func TestGetClusterAndSession(t *testing.T) {
	cluster := gocql.NewCluster("localhost:9042")
	cluster.Authenticator = gocql.PasswordAuthenticator{
		Username: "cassandra",
		Password: "cassandra",
	}
	cluster.NumConns = 2
	cluster.SocketKeepalive = time.Millisecond * 500
	cluster.Timeout = 15 * time.Second
	cluster.ProtoVersion = 4
	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy()) // according to issue this code causes panic, it`s not though.
	cluster.RetryPolicy = &gocql.SimpleRetryPolicy{NumRetries: 5}
	cluster.DisableInitialHostLookup = true
	// "ANY", "ONE", "TWO", "THREE", "QUORUM", "ALL", "LOCAL_QUORUM", "EACH_QUORUM", "LOCAL_ONE"

	cluster.Consistency = gocql.Any // tested on diferent consistencyes

	if session, err := cluster.CreateSession(); err != nil {
		panic(err)
	} else {
		// Create a keyspace
		keyspace := "my_keyspace"
		replication := "{ 'class' : 'SimpleStrategy', 'replication_factor' : 1 }"

		err = session.Query(fmt.Sprintf("CREATE KEYSPACE IF NOT EXISTS %s WITH replication = %s", keyspace, replication)).Exec()
		if err != nil {
			log.Fatal("Unable to create keyspace: ", err)
		}

		fmt.Println("Keyspace created successfully")
	}

}

The output for test

=== RUN   TestGetClusterAndSession
Keyspace created successfully
--- PASS: TestGetClusterAndSession (0.02s)
PASS

Process finished with the exit code 0

Output for cql requests

SELECT function_name, argument_types, argument_names, body, called_on_null_input, language, return_type FROM system_schema.functions WHERE keyspace_name = 'test';
 function_name  | argument_types | argument_names   | body                | called_on_null_input | language | return_type
----------------+----------------+------------------+---------------------+----------------------+----------+-------------
    add_numbers | ['int', 'int'] |       ['a', 'b'] |       return a + b; |                False |     java |         int
 state_function | ['int', 'int'] | ['state', 'val'] | return state + val; |                False |     java |         int

SELECT aggregate_name, argument_types, final_func, initcond, return_type, state_func, state_type FROM system_schema.aggregates WHERE keyspace_name ='test';
 aggregate_name | argument_types | final_func | initcond | return_type | state_func     | state_type
----------------+----------------+------------+----------+-------------+----------------+------------
  sum_aggregate |        ['int'] |       null |        0 |         int | state_function |        int

SELECT peer, rpc_address FROM system.peers;
 peer | rpc_address
------+-------------

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants