Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect calculation of canonical schema #267

Open
timou opened this issue Apr 14, 2023 · 1 comment
Open

Incorrect calculation of canonical schema #267

timou opened this issue Apr 14, 2023 · 1 comment

Comments

@timou
Copy link

timou commented Apr 14, 2023

For some valid schemas, goavro will incorrectly compute the canonical form. This results in an incorrect Rabin fingerprint. Due to the random nature of map iteration in golang, the fingerprint for the exact same schema can change between consecutive runs, which should not happen.

A unit test is included below that can reproduce the issue. I also have a fix in a PR. I am not sure the fix is sound, however all tests pass.

The problem occurs in schemas with nested types with the same name, combined with the use of namespaces. Take the following schema:

{
  "name" : "thename",
  "type" : "record",
  "namespace" : "com.fooname",
  "fields" : [
    {
      "name" : "bar" ,
      "type" : {
        "name" : "bar",
        "type" : "record",
        "fields" : [
          {
            "name" : "car",
            "type" : "int"
          }
        ]
      }
    }
  ]
}

The bug appears to occur when the inner bar is processed before the outer bar (but not vice-versa). In that case, the inner bar is properly namespaced to com.fooname.bar. However, an entry is made in typeLookup for the name foo to this namespaced value. The issue manifests when the outer foo is subsequently processed, as due to the name collision, it incorrectly also gets namespaced to com.fooname.bar. It should be bar in canonical form. If the outer bar is processed first, it works correctly, but this is random, due to random map iteration in golang.

The following test checks for the bug, by checking multiple runs of canonicalizations of the above schema generating different Rabin fingerprints. I have a PR that I will submit which adds a fix, but I am not sure the fix is correct in all cases.

func TestCanonicalSchemaFingerprintFlake(t *testing.T) {
	const schema = `{
  "name" : "thename",
  "type" : "record",
  "namespace" : "com.fooname",
  "fields" : [
    {
      "name" : "bar" ,
      "type" : {
        "name" : "bar",
        "type" : "record",
        "fields" : [
          {
            "name" : "car",
            "type" : "int"
          }
        ]
      }
    }
  ]
}
`
	codec, err := NewCodec(schema)
	if err != nil {
		t.Fatalf("unexpected schema parse failure, err=%v", err)
	}
	prevRun := codec.Rabin

	// This test is flaky. It exposes a bug that manifests due to
	// randomized map iteration. However, 32 iterations should give
	// high probability to see the failure.
	for i := 0; i < 32; i++ {
		codec, err = NewCodec(schema)
		currentRun := codec.Rabin
		if prevRun != currentRun {
			t.Fatalf("same schema should always have same fingerprint, rabin1: %d, rabin2: %d", prevRun, currentRun)
		}
		prevRun = currentRun
	}
}
@timou
Copy link
Author

timou commented Apr 14, 2023

See proposed fix in #268.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant