Liquibase exception: Invalid string encoding on column.remarks. #113

Closed
Dzeri96 opened this issue Apr 19, 2024 · 12 comments

Dzeri96 commented Apr 19, 2024

Description of the Issue

I'm testing the new DDL export feature and unfortunately, I've immediately run into a problem. This is the stack trace:

Jailer 16.1

liquibase.exception.UnexpectedLiquibaseException: Invalid string encoding on column.remarks. To resolve, remove the invalid character on the database and try again in changeSet tmp/up-76ece3f9-3a7e-4d37-88d9-c36a0b26975d.xml::1713526740023-2494::ATE4641769 (generated). To resolve, remove the invalid character on the database and try again
	at liquibase.serializer.core.xml.XMLChangeLogSerializer.createNode(XMLChangeLogSerializer.java:195)
	at liquibase.serializer.core.xml.XMLChangeLogSerializer.write(XMLChangeLogSerializer.java:153)
	at liquibase.diff.output.changelog.DiffToChangeLog.printNew(DiffToChangeLog.java:268)
	at liquibase.diff.output.changelog.DiffToChangeLog$1.run(DiffToChangeLog.java:178)
	at liquibase.Scope.lambda$child$0(Scope.java:190)
	at liquibase.Scope.child(Scope.java:199)
	at liquibase.Scope.child(Scope.java:189)
	at liquibase.Scope.child(Scope.java:168)
	at liquibase.diff.output.changelog.DiffToChangeLog.print(DiffToChangeLog.java:172)
	at liquibase.diff.output.changelog.DiffToChangeLog.print(DiffToChangeLog.java:126)
	at liquibase.command.core.GenerateChangelogCommandStep.run(GenerateChangelogCommandStep.java:157)
	at liquibase.command.CommandScope.execute(CommandScope.java:219)
	at net.sf.jailer.ui.ddl_script_generator.DDLScriptGeneratorPanel.generateChangeLog(DDLScriptGeneratorPanel.java:418)
	at net.sf.jailer.ui.ddl_script_generator.DDLScriptGeneratorPanel.doGenerate(DDLScriptGeneratorPanel.java:263)
	at net.sf.jailer.ui.ddl_script_generator.DDLScriptGeneratorPanel.lambda$okButtonActionPerformed$5(DDLScriptGeneratorPanel.java:738)
	at java.base/java.lang.Thread.run(Thread.java:833)

When I use liquibase by itself on the exact same database, I can successfully export the entire schema, so there must be something in the way it's being called from Jailer that's causing the issue.
The error message does not say what exact object liquibase is having a problem with.

I think I'll need your help so we can debug this in multiple steps.

Steps to Reproduce the Issue

Unfortunately, I can't share much about the underlying database, as it does not belong to me. I got this error by using the "Generate DDL Script" option with default settings. The standalone liquibase setup, which already works, uses nothing special, only the most basic settings. I just had to increase the JVM stack size since the schema is very large.

Debug Information

Jailer: 16.1
liquibase cli: 4.26.0
Oracle DB: v11

@Wisser Wisser self-assigned this Apr 19, 2024
Owner

Wisser commented Apr 19, 2024

Thank you for the report.

I have looked at the Liquibase source code, and it seems that when the changelog is written as an XML file, the values are checked for characters that are problematic in XML. In your case, a column comment probably contains such characters.

Could you please test version 16.1.1? https://sourceforge.net/projects/jailer/files/stage

This version writes the changelog as a JSON file. I hope this is more robust against such character problems.
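
As an illustration of the kind of check described above, here is a minimal sketch that scans a string for code points outside the character range allowed by XML 1.0. It is not Liquibase's code; the class and method names are hypothetical, and only the allowed ranges are taken from the XML 1.0 specification.

```java
public class XmlCharCheck {

    /** True if the code point is allowed by the XML 1.0 "Char" production. */
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first invalid character, or -1 if there is none. */
    public static int firstInvalidXmlChar(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isValidXmlChar(cp)) {
                return i;
            }
            // Step over both halves of a surrogate pair if necessary.
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical column comment that ends with a control character (U+0001).
        String remarks = "customer id\u0001";
        System.out.println(firstInvalidXmlChar(remarks));
    }
}
```

A check like this on the column comments could help locate the offending object, since the error message itself does not name it.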

Author

Dzeri96 commented Apr 21, 2024

I'll be able to test it tomorrow. On that note, I've always just used SQL as the output format and that definitely worked. I also assumed Jailer would use the same, but now that I think about it, it does support multiple output formats. I think it's best if we also let the user choose the DDL format. Liquibase definitely supports many.

Owner

Wisser commented Apr 26, 2024

I managed to reproduce this error myself. It has been corrected in release 16.1.2.
In addition, “Drop ...” statements for removing the database objects can now also be generated.

Author

Dzeri96 commented Apr 29, 2024

Sorry for the delayed response... This is the new error I'm getting with 16.1.2:

Mail: rwisser@users.sourceforge.net

Jailer 16.1.2
java.nio.charset.MalformedInputException: Input length = 1
	at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
	at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
	at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:188)
	at java.base/java.io.InputStreamReader.read(InputStreamReader.java:177)
	at org.yaml.snakeyaml.reader.UnicodeReader.read(UnicodeReader.java:118)
	at org.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:180)
	at org.yaml.snakeyaml.reader.StreamReader.ensureEnoughData(StreamReader.java:173)
	at org.yaml.snakeyaml.reader.StreamReader.peek(StreamReader.java:133)
	at org.yaml.snakeyaml.scanner.ScannerImpl.scanPlain(ScannerImpl.java:2046)
	at org.yaml.snakeyaml.scanner.ScannerImpl.fetchPlain(ScannerImpl.java:1074)
	at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:432)
	at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:238)
	at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingValue.produce(ParserImpl.java:669)
	at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:161)
	at org.yaml.snakeyaml.comments.CommentEventsCollector$1.peek(CommentEventsCollector.java:57)
	at org.yaml.snakeyaml.comments.CommentEventsCollector$1.peek(CommentEventsCollector.java:43)
	at org.yaml.snakeyaml.comments.CommentEventsCollector.collectEvents(CommentEventsCollector.java:136)
	at org.yaml.snakeyaml.comments.CommentEventsCollector.collectEvents(CommentEventsCollector.java:116)
	at org.yaml.snakeyaml.composer.Composer.composeScalarNode(Composer.java:241)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:205)
	at org.yaml.snakeyaml.composer.Composer.composeKeyNode(Composer.java:359)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:344)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.composeValueNode(Composer.java:369)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:348)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.composeSequenceNode(Composer.java:277)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:207)
	at org.yaml.snakeyaml.composer.Composer.composeValueNode(Composer.java:369)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:348)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.composeValueNode(Composer.java:369)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:348)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.composeSequenceNode(Composer.java:277)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:207)
	at org.yaml.snakeyaml.composer.Composer.composeValueNode(Composer.java:369)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:348)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.composeValueNode(Composer.java:369)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:348)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.composeSequenceNode(Composer.java:277)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:207)
	at org.yaml.snakeyaml.composer.Composer.composeValueNode(Composer.java:369)
	at org.yaml.snakeyaml.composer.Composer.composeMappingChildren(Composer.java:348)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:323)
	at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:209)
	at org.yaml.snakeyaml.composer.Composer.getNode(Composer.java:131)
	at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:157)
	at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:178)
	at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:493)
	at org.yaml.snakeyaml.Yaml.load(Yaml.java:434)
	at liquibase.parser.core.yaml.YamlChangeLogParser.parseYamlStream(YamlChangeLogParser.java:103)
	at liquibase.parser.core.yaml.YamlChangeLogParser.parse(YamlChangeLogParser.java:40)
	at liquibase.command.core.helpers.DatabaseChangelogCommandStep.lambda$getDatabaseChangeLog$0(DatabaseChangelogCommandStep.java:129)
	at liquibase.Scope.child(Scope.java:199)
	at liquibase.Scope.child(Scope.java:175)
	at liquibase.command.core.helpers.DatabaseChangelogCommandStep.getDatabaseChangeLog(DatabaseChangelogCommandStep.java:128)
	at liquibase.command.core.helpers.DatabaseChangelogCommandStep.run(DatabaseChangelogCommandStep.java:87)
	at liquibase.command.CommandScope.execute(CommandScope.java:219)
	at liquibase.Liquibase.lambda$update$1(Liquibase.java:340)
	at liquibase.Scope.lambda$child$0(Scope.java:190)
	at liquibase.Scope.child(Scope.java:199)
	at liquibase.Scope.child(Scope.java:189)
	at liquibase.Scope.child(Scope.java:168)
	at liquibase.Liquibase.runInScope(Liquibase.java:1436)
	at liquibase.Liquibase.update(Liquibase.java:330)
	at net.sf.jailer.ui.ddl_script_generator.DDLScriptGeneratorPanel.doGenerate(DDLScriptGeneratorPanel.java:455)
	at net.sf.jailer.ui.ddl_script_generator.DDLScriptGeneratorPanel.lambda$okButtonActionPerformed$5(DDLScriptGeneratorPanel.java:960)
	at java.base/java.lang.Thread.run(Thread.java:833)

I don't know which format Jailer is trying to use, but plain SQL should work in my case. I think the user should be able to select it.
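
For context, a `MalformedInputException` like the one above is what a strict Java decoder reports when the bytes it reads are not valid in the expected charset. A minimal sketch, assuming text that was written as ISO-8859-1 and is then read as UTF-8 (this is only a guess at the kind of mismatch involved, not a diagnosis of this case); the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {

    /** Returns true if the bytes can be decoded as UTF-8 without errors. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of CharacterCodingException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical column comment with non-ASCII letters.
        String comment = "Gr\u00f6\u00dfe";
        System.out.println(isValidUtf8(comment.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(comment.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```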

Owner

Wisser commented Apr 29, 2024

Thanks for the info.

16.1.2 used yaml as the format for the change-log, from which the SQL script is then generated. This indirection has the advantage that the user can select a target DBMS that differs from the source DBMS.

In your database, a column comment contains an invalid character that causes problems when serializing to XML. That's why I used YAML instead. I had also tried to correct invalid characters, but that obviously doesn't work well.

I have removed the correction stuff now. Could you please check the two test versions here?

https://sourceforge.net/projects/jailer/files/stage/

Author

Dzeri96 commented Apr 30, 2024

I've tested the JSON version and it seems to work now. However, it looks like all tables are included, not just the ones in the extraction model. The DB schema I'm working on is massive, and I was hoping Jailer could do its subsetting magic on the DDL too. It shouldn't be hard at all, since an includeObject option already exists in liquibase.

Owner

Wisser commented Apr 30, 2024

Do you mean something like the one discussed here?
Add schema generation for extracted database

I was a little skeptical that it would actually make sense. What do you think?

Liquibase's API is a bit strangely designed. I actually haven't figured out how to pass parameters like includeObject yet, so I'll have to do some more research. In addition, this feature (liquibase/liquibase#5746) is still missing.

Nevertheless, you could use Jailer's CLI command “print-closure” (see linked issue) to create the command line for liquibase that generates the DDL for exactly these tables.
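
A rough sketch of what assembling such a filter could look like, given a list of table names (for example, taken from the output of "print-closure"). The `table:NAME` form and the `--include-objects` flag are assumptions about the Liquibase CLI and should be checked against its documentation; the class, method, and table names are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class IncludeObjectsFilter {

    /** Joins table names into a comma-separated filter such as "table:A,table:B". */
    public static String build(List<String> tableNames) {
        return tableNames.stream()
                .map(name -> "table:" + name)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Hypothetical table names standing in for the closure of an extraction model.
        List<String> closure = List.of("EMPLOYEE", "DEPARTMENT", "SALARYGRADE");
        // Assumed flag name; verify against the Liquibase CLI documentation.
        System.out.println("liquibase generate-changelog --include-objects=\""
                + build(closure) + "\"");
    }
}
```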

Author

Dzeri96 commented May 2, 2024

> Do you mean something like the one discussed here?

Yes, that's exactly what I meant. I think there are many cases today where a bunch of applications share the same DB and ours is definitely one of them. But regarding what the original poster in the issue said, I don't know how feasible it is to extract related functions/procedures.

> In addition, this liquibase/liquibase#5746 is still missing.

Yeah, I kind of promised I'd try to implement it in liquibase directly. However, I don't see jailer needing this feature at all, since it already knows which tables depend on each other. The print-closure output looks like a good source for includeObject. It would just be nice to have it routed automatically.

In any case, I think this issue can be closed when you release the JSON version of the DDL export. We can continue the "schema subset" discussion in another thread, just tag me there.

Owner

Wisser commented May 2, 2024

I have introduced an option “(include) Tables associated with a subject table” in the “Generate DDL” dialog, which restricts the generation to the closure of the extraction model. Please try this out:

(deleted)

Presumably an option “(include) Tables in data model” would also be helpful for you. This would then only take into account the tables that are actually in the model. This makes a difference if there are additional tables from other applications in the schema. This option will come with the next release.

The Liquibase feature that you requested is actually not so important, since for every table, all of its parent tables are generally in the closure (and in the data model) as well.
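
To make the term concrete: the closure can be pictured as the set of tables reachable from the subject tables by following associations. The sketch below computes that set with a breadth-first search over a hypothetical table graph. It is a simplification for illustration only, not Jailer's implementation, and all names in it are invented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ClosureSketch {

    /** Returns all tables reachable from the subjects via the given associations. */
    public static Set<String> closure(Set<String> subjects,
                                      Map<String, List<String>> associations) {
        Set<String> visited = new LinkedHashSet<>(subjects);
        Deque<String> queue = new ArrayDeque<>(subjects);
        while (!queue.isEmpty()) {
            String table = queue.poll();
            for (String next : associations.getOrDefault(table, List.of())) {
                // add() returns false for tables that were already visited.
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical schema: ORDER_ITEM references ORDERS and PRODUCT,
        // ORDERS references CUSTOMER, and AUDIT_LOG is unrelated.
        Map<String, List<String>> associations = Map.of(
                "ORDER_ITEM", List.of("ORDERS", "PRODUCT"),
                "ORDERS", List.of("CUSTOMER"),
                "AUDIT_LOG", List.of());
        System.out.println(closure(Set.of("ORDER_ITEM"), associations));
    }
}
```

In this toy schema, the unrelated AUDIT_LOG table stays out of the result, which is the effect the new option is meant to have on the generated DDL.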

Owner

Wisser commented May 2, 2024

Here you will find an updated version with the option mentioned above.

https://sourceforge.net/projects/jailer/files/stage/jailer_16.1.2-closure2.zip

Regarding "functions/procedures":
these are not supported in the Liquibase Open Source Edition.
Sequences are missing if the “includeObjects” parameter is specified. The same applies to views that are not part of the data model.

Author

Dzeri96 commented May 2, 2024

I've tested the linked version on an existing extraction model and it's looking very good. Even the added subjects are there and so far there don't seem to be any missing objects.

For reference, my extraction model contains around 40 tables, whereas the entire data model has around 3500. You can see how this makes a difference, especially as we try to automate this process and create DBs for integration testing.

Owner

Wisser commented May 3, 2024

That's great. I have finally published this now (16.1.3).

Btw, additional subjects with an unsatisfiable "Where" condition (1=0 or similar) are perhaps a good way to extend the closure of an extraction model without affecting the extraction itself. That helps if you also want to generate the DDL of tables that are not yet in the closure.
