Suggesting transition to new approach of acquiring column name #168

Open
dongwook-chan opened this issue Sep 17, 2023 · 0 comments

@the4thdoctor

I'm a contributor to python-mysql-replication and you might remember me from the issue. I would like your opinion on eliminating table_map in python-mysql-replication.

Background:
python-mysql-replication currently gathers the column schema by running a SELECT against information_schema.columns. pg_chameleon appears to rely on the column names specifically:

    for column_name in event_after:
        try:
            column_type=column_map[column_name]
        except KeyError:
            self.logger.debug("Detected inconsistent structure for the table %s. The replay may fail. " % (table_name))
            column_type = 'text'
        if column_type in self.hexify and event_after[column_name]:
            event_after[column_name]=binascii.hexlify(event_after[column_name]).decode()
        elif column_type in self.hexify and isinstance(event_after[column_name], bytes):
            event_after[column_name] = ''
        elif column_type == 'json':
            event_after[column_name] = self.__decode_dic_keys(event_after[column_name])
        elif column_type in self.spatial_datatypes and event_after[column_name]:
            event_after[column_name] = self.__get_text_spatial(event_after[column_name])

I have used python-mysql-replication against a MySQL server that handles 2,500+ qps. When a replication gap exists (the reader lags behind the binlog), the SELECT returns the column schema as of the time the SELECT is executed, not as of the time the event was generated. As a result, the reader can receive wrong column names, column names in the wrong order, or dummy column names that do not exist in MySQL.
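The mismatch described above can be illustrated with a small, self-contained simulation. This is only a sketch: `decode_row`, the table layout, and the `UNKNOWN_COL` placeholder naming are hypothetical and are not taken from python-mysql-replication or pg_chameleon. It assumes row values arrive positionally and are paired with whatever column list the reader currently holds.

```python
# Hypothetical simulation of the stale-schema problem; not library code.

def decode_row(values, column_names):
    """Pair positional row values with column names, padding missing names."""
    names = list(column_names)
    # If the schema has fewer columns than the event, invent placeholder names.
    names += [f"UNKNOWN_COL{i}" for i in range(len(names), len(values))]
    return dict(zip(names, values))


# Column list when the event was written to the binlog.
schema_at_event_time = ["id", "name", "email"]
# Column list when the lagging reader queries information_schema:
# "name" has since been dropped by an ALTER TABLE.
schema_at_read_time = ["id", "email"]

row_values = (1, "alice", "alice@example.com")

correct = decode_row(row_values, schema_at_event_time)
stale = decode_row(row_values, schema_at_read_time)

print(correct)
print(stale)
```

With the stale column list, the value of the dropped `name` column ends up under `email`, and the real email value ends up under a placeholder name.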

Concern:
Given that pg_chameleon depends on python-mysql-replication, I wanted to get your input on potential disruptions. The old approach (SELECTing from information_schema) may have caused its own issues or limitations for pg_chameleon users.

Proposed Solutions:

Drop support for the old approach and parse optional_metadata, which holds the column names. This could lead to a cleaner codebase but might introduce breaking changes for those who rely on the old behavior.
julien-duponchelle/python-mysql-replication#477
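To make the proposal more concrete, here is a rough sketch of how column names could be read from the optional metadata of a table map event. It is not the implementation from the linked pull request. It assumes, from my reading of the MySQL binlog format, that the optional metadata is a sequence of type-length-value fields, that the column-name field has type code 4, and that lengths use MySQL's length-encoded integers. As far as I know the server only writes column names when `binlog_row_metadata=FULL` is set. The function names and the sample payload are hypothetical.

```python
# Sketch only: the field layout and type code below are assumptions about
# the MySQL binlog format, not code from python-mysql-replication.

COLUMN_NAME = 4  # assumed type code of the column-name field


def read_packed_int(buf, pos):
    """Read a MySQL length-encoded integer; return (value, next_position)."""
    first = buf[pos]
    if first < 251:
        return first, pos + 1
    if first == 252:
        return int.from_bytes(buf[pos + 1:pos + 3], "little"), pos + 3
    if first == 253:
        return int.from_bytes(buf[pos + 1:pos + 4], "little"), pos + 4
    if first == 254:
        return int.from_bytes(buf[pos + 1:pos + 9], "little"), pos + 9
    raise ValueError("unexpected length prefix: %d" % first)


def parse_column_names(metadata):
    """Walk the TLV fields and return the column names, or [] if absent."""
    pos = 0
    while pos < len(metadata):
        field_type = metadata[pos]
        length, pos = read_packed_int(metadata, pos + 1)
        value = metadata[pos:pos + length]
        pos += length
        if field_type != COLUMN_NAME:
            continue  # skip fields this sketch does not interpret
        names, vpos = [], 0
        while vpos < len(value):
            name_len, vpos = read_packed_int(value, vpos)
            names.append(value[vpos:vpos + name_len].decode("utf-8"))
            vpos += name_len
        return names
    return []


# Hand-built payload: one unrelated field (type 1), then the column names.
sample = bytes([1, 1, 0x80]) + bytes([COLUMN_NAME, 8]) + b"\x02id\x04name"
column_names = parse_column_names(sample)
print(column_names)
```

Because the names travel with the event itself, they reflect the table as it was when the event was generated, which is the property the information_schema lookup cannot guarantee.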

I would like to know what you think about this change. I will assist you with anything I can.
