Add standard newline/quoting behavior to dotenv store #622

scjudd · 2020-02-04T04:34:03Z

Rationale

The dotenv store as it exists right now performs splitting on newlines to determine where a new key-value pair or comment begins. This works remarkably well, up until you need to handle values that contain newlines.

While I couldn't find an offical dotenv file format spec, I sampled a number of open-source dotenv parsers and it seems that they typically apply the following rules:

Newline handling:

If a value is unquoted and contains a literal \n (0x5c6e), it is interpretted literally and NOT converted to an actual newline (0x0a).
If a value is single-quoted and contains a literal \n (0x5c6e), it is interpretted literally and NOT converted to an actual newline (0x0a).
If a value is double-quoted and contains a literal \n (0x5c6e), it is converted to an actual newline (0x0a).
If a value is either single- or double-quoted, it may contain an actual newline (0x0a).

Whitespace trimming:

If a value is unquoted and contains any leading or trailing whitespace, it is trimmed.
If a value is either single- or double-quoted and contains any leading or trailing whitespace, it is left untrimmed.

Quotation handling:

Because quotations around values have special meaning, they are interpretted and are not included in the parsed value. Literal quotes may be included within a quoted string either by escaping them or using the opposite quotation mark.

Because single- and double-quoted values may contain actual newlines, we cannot split our input data on newlines as this may be in the middle of a quoted value. This, along with the other rules around handling quoted values, prompted me to try and implement a more robust parsing solution. This commit is my first stab at that.

Special Considerations

This is not a backwards-compatible change:

The dotenv files produced by this version of SOPS cannot be read by an earlier version.
The dotenv files produced by an earlier version of SOPS can be read by this version, with the understanding that the semantics around quotations and newlines have changed.

Examples

The below examples show how double-quoted values are passed to the running environment:

$ echo 'FOO="foo\\nbar\\nbaz"' > plaintext.env
$ sops -e --output ciphertext.env plaintext.env
$ sops exec-env ciphertext.env 'env | grep FOO | xxd'
00000000: 464f 4f3d 666f 6f5c 6e62 6172 5c6e 6261  FOO=foo\nbar\nba
00000010: 7a0a                                     z.

$ echo 'FOO="foo\nbar\nbaz"' > plaintext.env
$ sops -e --output ciphertext.env plaintext.env
$ sops exec-env ciphertext.env 'env | grep -A2 FOO | xxd'
00000000: 464f 4f3d 666f 6f0a 6261 720a 6261 7a0a  FOO=foo.bar.baz.

codecov-io · 2020-02-04T05:51:19Z

Codecov Report

Merging #622 into develop will increase coverage by 2.14%.
The diff coverage is 84.1%.

@@             Coverage Diff             @@
##           develop     #622      +/-   ##
===========================================
+ Coverage    36.98%   39.13%   +2.14%     
===========================================
  Files           21       22       +1     
  Lines         2893     3023     +130     
===========================================
+ Hits          1070     1183     +113     
- Misses        1728     1735       +7     
- Partials        95      105      +10

Impacted Files	Coverage Δ
stores/dotenv/store.go	`25% <100%> (-7.46%)`	⬇️
stores/dotenv/parser.go	`83.33% <83.33%> (ø)`
stores/flatten.go	`91.52% <0%> (+4.23%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1634350...fe789bf. Read the comment docs.

scjudd · 2020-02-05T17:20:11Z

Alright, this is now to a point where I'm content with the quality of it and I promise to not push anymore unless someone requests changes. 😄

scjudd · 2020-02-10T20:44:36Z

Okay so I lied. I added whitespace trimming to comments.

ajvb

Left some comments, but overall this looks awesome @scjudd! Thank you very much.

cmd/sops/subcommand/exec/exec.go

stores/dotenv/parser.go

Rationale ========= The dotenv store as it exists right now performs splitting on newlines to determine where a new key-value pair or comment begins. This works remarkably well, up until you need to handle values that contain newlines. While I couldn't find an offical dotenv file format spec, I sampled a number of open-source dotenv parsers and it seems that they typically apply the following rules: Comments: * Comments may be written by starting a line with the `#` character. Newline handling: * If a value is unquoted or single-quoted and contains the character sequence `\n` (`0x5c6e`), it IS NOT decoded to a line feed (`0x0a`). * If a value is double-quoted and contains the character sequence `\n` (`0x5c6e`), it IS decoded to a line feed (`0x0a`). Whitespace trimming: * For comments, the whitespace immediately after the `#` character and any trailing whitespace is trimmed. * If a value is unquoted and contains any leading or trailing whitespace, it is trimmed. * If a value is either single- or double-quoted and contains any leading or trailing whitespace, it is left untrimmed. Quotation handling: * If a value is surrounded by single- or double-quotes, the quotation marks are interpreted and not included in the value. * Any number of single-quote characters may appear in a double-quoted value, or within a single-quoted value if they are escaped (i.e., `'foo\'bar'`). * Any number of double-quote characters may appear in a single-quoted value, or within a double-quoted value if they are escaped (i.e., `"foo\"bar"`). Because single- and double-quoted values may contain actual newlines, we cannot split our input data on newlines as this may be in the middle of a quoted value. This, along with the other rules around handling quoted values, prompted me to try and implement a more robust parsing solution. This commit is my first stab at that. Special Considerations ====================== This is _not_ a backwards-compatible change: * The `dotenv` files produced by this version of SOPS _cannot_ be read by an earlier version. * The `dotenv` files produced by an earlier version of SOPS _can_ be read by this version, with the understanding that the semantics around quotations and newlines have changed. Examples ======== The below examples show how double-quoted values are passed to the running environment: ```console $ echo 'FOO="foo\\nbar\\nbaz"' > plaintext.env $ sops -e --output ciphertext.env plaintext.env $ sops exec-env ciphertext.env 'env | grep FOO | xxd' 00000000: 464f 4f3d 666f 6f5c 6e62 6172 5c6e 6261 FOO=foo\nbar\nba 00000010: 7a0a z. ``` ```console $ echo 'FOO="foo\nbar\nbaz"' > plaintext.env $ sops -e --output ciphertext.env plaintext.env $ sops exec-env ciphertext.env 'env | grep -A2 FOO | xxd' 00000000: 464f 4f3d 666f 6f0a 6261 720a 6261 7a0a FOO=foo.bar.baz. ```

autrilla

The code itself LGTM.

What parsers did you look at? Could we have the parser in a separate repository, perhaps? It might be useful to other people.

scjudd · 2020-02-27T04:17:47Z

Hey @autrilla, I can't remember all of the different parsers I looked at, but the rules defined for https://github.com/motdotla/dotenv#rules are what I based a lot of this on.

As far as breaking this out into a separate repo, I'm not opposed to that, but right now it is very purpose-built for SOPS. If you have any suggestions on what changes you'd like to see, I'm down to push this forward a little further. We need to land something along these lines to fix the issue for newlines and exec-env.

scjudd · 2020-03-19T06:22:53Z

Hey @autrilla, just checking in on this. Is there anything I can do to help get this landed?

autrilla · 2020-03-20T21:47:06Z

Sorry, I forgot about this. LGTM, thanks for the patch!

Rationale ========= The dotenv store as it exists right now performs splitting on newlines to determine where a new key-value pair or comment begins. This works remarkably well, up until you need to handle values that contain newlines. While I couldn't find an offical dotenv file format spec, I sampled a number of open-source dotenv parsers and it seems that they typically apply the following rules: Comments: * Comments may be written by starting a line with the `#` character. Newline handling: * If a value is unquoted or single-quoted and contains the character sequence `\n` (`0x5c6e`), it IS NOT decoded to a line feed (`0x0a`). * If a value is double-quoted and contains the character sequence `\n` (`0x5c6e`), it IS decoded to a line feed (`0x0a`). Whitespace trimming: * For comments, the whitespace immediately after the `#` character and any trailing whitespace is trimmed. * If a value is unquoted and contains any leading or trailing whitespace, it is trimmed. * If a value is either single- or double-quoted and contains any leading or trailing whitespace, it is left untrimmed. Quotation handling: * If a value is surrounded by single- or double-quotes, the quotation marks are interpreted and not included in the value. * Any number of single-quote characters may appear in a double-quoted value, or within a single-quoted value if they are escaped (i.e., `'foo\'bar'`). * Any number of double-quote characters may appear in a single-quoted value, or within a double-quoted value if they are escaped (i.e., `"foo\"bar"`). Because single- and double-quoted values may contain actual newlines, we cannot split our input data on newlines as this may be in the middle of a quoted value. This, along with the other rules around handling quoted values, prompted me to try and implement a more robust parsing solution. This commit is my first stab at that. Special Considerations ====================== This is _not_ a backwards-compatible change: * The `dotenv` files produced by this version of SOPS _cannot_ be read by an earlier version. * The `dotenv` files produced by an earlier version of SOPS _can_ be read by this version, with the understanding that the semantics around quotations and newlines have changed. Examples ======== The below examples show how double-quoted values are passed to the running environment: ```console $ echo 'FOO="foo\\nbar\\nbaz"' > plaintext.env $ sops -e --output ciphertext.env plaintext.env $ sops exec-env ciphertext.env 'env | grep FOO | xxd' 00000000: 464f 4f3d 666f 6f5c 6e62 6172 5c6e 6261 FOO=foo\nbar\nba 00000010: 7a0a z. ``` ```console $ echo 'FOO="foo\nbar\nbaz"' > plaintext.env $ sops -e --output ciphertext.env plaintext.env $ sops exec-env ciphertext.env 'env | grep -A2 FOO | xxd' 00000000: 464f 4f3d 666f 6f0a 6261 720a 6261 7a0a FOO=foo.bar.baz. ```

mltsy · 2020-07-23T17:51:54Z

Turns out this is a pseudo-breaking change for people using the dotenv format (including me), because the previous version of sops doesn't know what to do with the new quoted values (so in our build pipeline, which uses sops 3.5.0, it's unable to parse the new format coming from our local environments as people start upgrading to 3.6.0)

I'm not sure if there's a good place to put these notes. Ideally they wouldn't have been pointed out more prominently in the changelog - maybe I'll put in an issue to request a flag for --omit-dotenv-quotes or something in case others are encountering this issue...

This reverts commit 4507019.

…#706) This reverts commit 4507019.

patricknelson · 2020-12-10T07:08:42Z

Is this at all related to my issue at #784?

In my use case, I was loading environment variables which contain multiple lines (stored in encrypted JSON and YAML). It appears to be interpreted properly when viewing/editing directly within sops itself but not when loaded into environment variables via exec-env.

mltsy · 2020-12-10T14:47:17Z

Don't think it's related to this (this was only present in 3.5.0) - except that it's probably related to the dotenv encoder. Commented on your issue.

patricknelson · 2020-12-10T23:09:52Z

I think you're right @mltsy, I see the operative line that affects my use case were reverted in 5d32d9a. However, I will say that investigating my issue which is related via the shared use of the dotenv output formatting, I found a possible issue with how the dotenv output format encodes/decodes environment variables (i.e. it'll add \n's even though the variable in it's raw form may already contain \n's, meaning it is mixing unencoded content with encoded content and potentially mangling things). I have more details in my other comment.

Basically, I recommend using something like joho/godotenv to encode/decode dotenv values and then separating that process from the process used for calling os.StartProcess which may accept an entirely different environment variable encoding (something I don't know, since I'm totally new to golang).

Cherry-picked from: Add standard newline/quoting behavior to dotenv store (getsops#622) 4507019 Closes issue: https://github.com/arrikto/rok/issues/5563 Rationale ========= The dotenv store as it exists right now performs splitting on newlines to determine where a new key-value pair or comment begins. This works remarkably well, up until you need to handle values that contain newlines. While I couldn't find an offical dotenv file format spec, I sampled a number of open-source dotenv parsers and it seems that they typically apply the following rules: Comments: * Comments may be written by starting a line with the `#` character. Newline handling: * If a value is unquoted or single-quoted and contains the character sequence `\n` (`0x5c6e`), it IS NOT decoded to a line feed (`0x0a`). * If a value is double-quoted and contains the character sequence `\n` (`0x5c6e`), it IS decoded to a line feed (`0x0a`). Whitespace trimming: * For comments, the whitespace immediately after the `#` character and any trailing whitespace is trimmed. * If a value is unquoted and contains any leading or trailing whitespace, it is trimmed. * If a value is either single- or double-quoted and contains any leading or trailing whitespace, it is left untrimmed. Quotation handling: * If a value is surrounded by single- or double-quotes, the quotation marks are interpreted and not included in the value. * Any number of single-quote characters may appear in a double-quoted value, or within a single-quoted value if they are escaped (i.e., `'foo\'bar'`). * Any number of double-quote characters may appear in a single-quoted value, or within a double-quoted value if they are escaped (i.e., `"foo\"bar"`). Because single- and double-quoted values may contain actual newlines, we cannot split our input data on newlines as this may be in the middle of a quoted value. This, along with the other rules around handling quoted values, prompted me to try and implement a more robust parsing solution. This commit is my first stab at that. Special Considerations ====================== This is _not_ a backwards-compatible change: * The `dotenv` files produced by this version of SOPS _cannot_ be read by an earlier version. * The `dotenv` files produced by an earlier version of SOPS _can_ be read by this version, with the understanding that the semantics around quotations and newlines have changed. Examples ======== The below examples show how double-quoted values are passed to the running environment: ```console $ echo 'FOO="foo\\nbar\\nbaz"' > plaintext.env $ sops -e --output ciphertext.env plaintext.env $ sops exec-env ciphertext.env 'env | grep FOO | xxd' 00000000: 464f 4f3d 666f 6f5c 6e62 6172 5c6e 6261 FOO=foo\nbar\nba 00000010: 7a0a z. ``` ```console $ echo 'FOO="foo\nbar\nbaz"' > plaintext.env $ sops -e --output ciphertext.env plaintext.env $ sops exec-env ciphertext.env 'env | grep -A2 FOO | xxd' 00000000: 464f 4f3d 666f 6f0a 6261 720a 6261 7a0a FOO=foo.bar.baz. ```

scjudd force-pushed the develop branch from 1d3f0cb to 623fa10 Compare February 4, 2020 05:45

scjudd force-pushed the develop branch 17 times, most recently from ebc316d to ea7acae Compare February 5, 2020 17:17

scjudd force-pushed the develop branch from ea7acae to e12b1b5 Compare February 10, 2020 20:43

scjudd mentioned this pull request Feb 11, 2020

Deal with special char #628

Closed

scjudd requested a review from ajvb February 11, 2020 16:34

ajvb requested a review from autrilla February 11, 2020 20:15

ajvb approved these changes Feb 11, 2020

View reviewed changes

cmd/sops/subcommand/exec/exec.go Show resolved Hide resolved

stores/dotenv/parser.go Show resolved Hide resolved

stores/dotenv/parser.go Show resolved Hide resolved

scjudd force-pushed the develop branch 2 times, most recently from 06098a7 to 43df2c9 Compare February 12, 2020 02:37

scjudd force-pushed the develop branch from 43df2c9 to fe789bf Compare February 12, 2020 02:38

autrilla approved these changes Feb 14, 2020

View reviewed changes

autrilla merged commit 4507019 into getsops:develop Mar 20, 2020

mltsy mentioned this pull request Jul 23, 2020

dotenv format in 3.6 is incompatible with version 3.5 #702

Closed

akanix42 mentioned this pull request Jul 23, 2020

sops 3.6 fails to properly decrypt .env files created using sops 3.5 #705

Closed

autrilla added a commit that referenced this pull request Jul 24, 2020

Revert "Add standard newline/quoting behavior to dotenv store (#622)"

fa29bc8

This reverts commit 4507019.

autrilla mentioned this pull request Jul 24, 2020

Revert "Add standard newline/quoting behavior to dotenv store" #706

Merged

autrilla added a commit that referenced this pull request Jul 27, 2020

Revert "Add standard newline/quoting behavior to dotenv store (#622)" (…

5d32d9a

…#706) This reverts commit 4507019.

davidcaste mentioned this pull request Aug 3, 2020

sopssecretgenerator can't decrypt files encrypted with sops v3.6.0 goabout/kustomize-sopssecretgenerator#12

Closed

felixfontein mentioned this pull request Feb 10, 2024

INI and DotEnv stores are not roundtrip-safe, and quoting is generally a problem for these formats #1435

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add standard newline/quoting behavior to dotenv store #622

Add standard newline/quoting behavior to dotenv store #622

scjudd commented Feb 4, 2020 •

edited

codecov-io commented Feb 4, 2020 •

edited

scjudd commented Feb 5, 2020

scjudd commented Feb 10, 2020

ajvb left a comment

autrilla left a comment

scjudd commented Feb 27, 2020 •

edited

scjudd commented Mar 19, 2020

autrilla commented Mar 20, 2020

mltsy commented Jul 23, 2020

patricknelson commented Dec 10, 2020

mltsy commented Dec 10, 2020

patricknelson commented Dec 10, 2020 •

edited

Add standard newline/quoting behavior to dotenv store #622

Add standard newline/quoting behavior to dotenv store #622

Conversation

scjudd commented Feb 4, 2020 • edited

Rationale

Special Considerations

Examples

codecov-io commented Feb 4, 2020 • edited

Codecov Report

scjudd commented Feb 5, 2020

scjudd commented Feb 10, 2020

ajvb left a comment

Choose a reason for hiding this comment

autrilla left a comment

Choose a reason for hiding this comment

scjudd commented Feb 27, 2020 • edited

scjudd commented Mar 19, 2020

autrilla commented Mar 20, 2020

mltsy commented Jul 23, 2020

patricknelson commented Dec 10, 2020

mltsy commented Dec 10, 2020

patricknelson commented Dec 10, 2020 • edited

scjudd commented Feb 4, 2020 •

edited

codecov-io commented Feb 4, 2020 •

edited

scjudd commented Feb 27, 2020 •

edited

patricknelson commented Dec 10, 2020 •

edited