Shorter identifiers for better log readability #640

Open
pmarini-nc opened this issue Jan 18, 2024 · 0 comments

It would be great to have shorter identifiers in the logs. This would improve readability and, according to a simple analysis I did on a server with around 10-15 active users, trimming the identifiers to 10 characters would not produce any false duplicates.

Or am I missing the reason to put the full identifier in the logs?

The analysis I did is the following:

  1. Dump the journald log (around 500k entries, around 3 months of history):
    journalctl --output=json --output-fields=MESSAGE --unit ncsignaling > /tmp/ncsignaling.log
  2. Process the output with the following Python script:
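import pandas as pd

def find_largest(ws):
    # Return the longest whitespace-separated token of ws,
    # or None if that token is shorter than 20 characters.
    s_out = ""
    max_len_s = len(s_out)
    for s in ws.split():
        if len(s) >= max_len_s:
            max_len_s = len(s)
            s_out = s
    if max_len_s < 20:
        s_out = None
    return s_out

# One JSON object per line, as written by journalctl --output=json
df1 = pd.read_json("/tmp/ncsignaling.log", lines=True)

# Treat the text before the first ":" as the component (e.g. client.go)
df1["component"] = df1.MESSAGE.str.split(":").str[0]

# .copy() avoids pandas' SettingWithCopyWarning when columns are added below
df2 = df1.query("component=='client.go'").copy()

# Treat the third ":"-separated field as the message text
df2["msg"] = df2.MESSAGE.str.split(":").apply(lambda x: x[2])

df2["identifier"] = df2.msg.apply(find_largest)

df3 = df2[df2.identifier.notnull()].copy()

# Keep only the first 10 characters of each identifier
df3["short_id"] = df3.identifier.str[:10]

print("Unique Identifiers: %i" % df3.identifier.nunique())
print("Unique Short Identifiers: %i" % df3.short_id.nunique())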

The output is:

Unique Identifiers: 11681
Unique Short Identifiers: 11681
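
As a rough sanity check on this result, the chance of at least one clash among truncated identifiers can be estimated with the birthday approximation. This is only a sketch under assumptions that are not confirmed anywhere in this issue: that the first 10 characters are drawn uniformly at random from a 64-symbol alphabet. If the identifiers share a structured prefix, the real collision risk would be higher. The function name and the `alphabet_size` default are hypothetical.

```python
import math

def collision_probability(n_ids, prefix_len, alphabet_size=64):
    """Birthday approximation: P(at least one collision) ~= 1 - exp(-n(n-1) / (2N)).

    Assumes (hypothetically) that prefixes are uniformly random over
    alphabet_size ** prefix_len possible values.
    """
    space = alphabet_size ** prefix_len
    expected_pairs = n_ids * (n_ids - 1) / (2 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x
    return -math.expm1(-expected_pairs)

# 11681 is the number of unique identifiers reported above
p = collision_probability(11681, 10)
print("Estimated collision probability: %.2e" % p)
```

Under those assumptions the estimate is far below one in a billion for this sample, which is consistent with seeing no duplicates.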

For simplicity I restricted the analysis to events logged by client.go. The analysis is naive, but I think it makes the point.
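
To go one step beyond checking a fixed length of 10, the same data could be used to find the shortest prefix length that still keeps every identifier distinct. A minimal sketch follows; `min_unique_prefix_length` is a hypothetical helper and the sample values are made up for illustration, not taken from real logs.

```python
def min_unique_prefix_length(identifiers):
    """Return the smallest k such that truncating every distinct
    identifier to its first k characters yields no duplicates."""
    unique_ids = set(identifiers)
    if not unique_ids:
        return 0
    max_len = max(len(i) for i in unique_ids)
    for k in range(1, max_len + 1):
        # Compare the number of distinct prefixes with the number of distinct ids
        if len({i[:k] for i in unique_ids}) == len(unique_ids):
            return k
    return max_len

# Made-up sample values for illustration only
sample = ["abcdef", "abcxyz", "zzz"]
print(min_unique_prefix_length(sample))
```

With the script above, it could be called as `min_unique_prefix_length(df3.identifier)` to see how much headroom a 10-character prefix leaves on a given log.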
