flattenSpec not working as documented #290

Open
donbowman opened this issue Feb 3, 2019 · 5 comments

Comments

@donbowman
Author

I am using the Tranquility HTTP server with Druid 0.13. It's generally working, except that flattenSpec doesn't do what it should.

As you can see below, I have created a dataSource, tsrc. I then post an event with nested JSON and rely on a flattenSpec to un-nest it. I end up with a value for 'host' ('me') but no value for 'action', and I don't see any errors.

This is Tranquility 0.8.3.

import datetime
import requests

msg = {
    "date": datetime.datetime.utcnow().isoformat() + 'Z',
    "host": "me",
    "values": {
        "action": "allow",
        "dummy": "foobar"
    }
}

auth = requests.auth.HTTPBasicAuth('USER','PASS')
r = requests.post('https://tranquility.DOMAIN/v1/post/tsrc', auth=auth, json=msg)
print(r.status_code)

    "dataSources": [
    {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "tsrc",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "format" : "json",
              "flattenSpec": {
                "useFieldDiscovery": true,
                "fields": [
                  { "type": "root", "name": "host", "expr": "dummy" },
                  { "type": "path", "name": "action", "expr": "$.values.action" }
                ]
              },
              "dimensionsSpec" : {
                "dimensions" : ["host","action"],
                "dimensionsExclusions": []
              },
              "timestampSpec": {
                "column": "date",
                "format": "auto"
              }
            }
          },
          "metricsSpec" : [],
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "day",
            "queryGranularity" : "none",
            "rollup" : false
          }
        },
        "tuningConfig": {
          "type": "realtime",
          "intermediatePersistPeriod": "PT15M",
          "windowPeriod": "PT15M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
]
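
To make the expectation concrete, here is a minimal Python sketch of the row the flattenSpec above should produce for this message. It is not Druid's implementation: `get_path` and `flatten` are hypothetical helpers, `get_path` only understands plain dotted expressions such as `$.values.action`, and keeping only top-level scalar fields is a rough stand-in for `useFieldDiscovery`.

```python
def get_path(record, expr):
    """Resolve a simple '$.a.b' expression against nested dicts (hypothetical helper)."""
    node = record
    for key in expr.lstrip("$").strip(".").split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def flatten(record, path_fields):
    # Keep top-level scalar fields (a rough approximation of useFieldDiscovery),
    # then add each configured path field under its own name.
    row = {k: v for k, v in record.items() if not isinstance(v, (dict, list))}
    for name, expr in path_fields.items():
        row[name] = get_path(record, expr)
    return row


msg = {
    "date": "2019-02-03T00:00:00Z",
    "host": "me",
    "values": {"action": "allow", "dummy": "foobar"},
}
row = flatten(msg, {"action": "$.values.action"})
print(row)
```

The point of the sketch is only that 'action' should come out as a top-level column next to 'host'; in the ingested data it stays empty.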

@donbowman
Author

Looking at the middleManager log, the task payload doesn't include the flattenSpec at all:

{                                                   
  "type" : "index_realtime",                                                                                                                                             
  "id" : "index_realtime_tsrc_2019-02-03T00:00:00.000Z_0_0",                                                                                                             
  "resource" : {                                                                                                                                                         
    "availabilityGroup" : "tsrc-2019-02-03T00:00:00.000Z-0000",                                                                                                          
    "requiredCapacity" : 1                                                                                                                                               
  },                                                                                                                                                                     
  "spec" : {                                                                                                                                                             
    "dataSchema" : {                                                                                                                                                     
      "dataSource" : "tsrc",                                                                                                                                             
      "parser" : {                                                                                                                                                       
        "type" : "map",                                                                                                                                                  
        "parseSpec" : {                                                                                                                                                  
          "format" : "json",                                                                                                                                             
          "timestampSpec" : {                                                                                                                                            
            "column" : "date",                                                                                                                                           
            "format" : "millis",                                                                                                                                         
            "missingValue" : null                                                                                                                                        
          },                                                                                                                                                             
          "dimensionsSpec" : {                                                                                                                                           
            "dimensions" : [ "host", "action" ],                                                                                                                         
            "spatialDimensions" : [ ]                                                                                                                                    
          }                                                                                                                                                              
        }                                                                                                                                                                
      },                                                                                                                                                                 
      "metricsSpec" : [ ],                                                                                                                                               
      "granularitySpec" : {                                                                                                                                              
        "type" : "uniform",                                                                                                                                              
        "segmentGranularity" : "DAY",                                                                                                                                    
        "queryGranularity" : {                                                                                                                                           
          "type" : "none"                                                                                                                                                
        },                                                                                                                                                               
        "rollup" : false,                                                                                                                                                
        "intervals" : null                                                                                                                                               
      },                                                                                                                                                                 
      "transformSpec" : {                                                                                                                                                
        "filter" : null,                                                                                                                                                 
        "transforms" : [ ]                                                                                                                                               
      }                                                                                                                                                                  
    },                                                                                                                                                                   
    "ioConfig" : {                                                                                                                                                       
      "type" : "realtime",                                                                                                                                               
      "firehose" : {                                                                                                                                                     
        "type" : "clipped",                                                                                                                                              
        "delegate" : {                                                                                                                                                   
          "type" : "timed",                                                                                                                                              
          "delegate" : {                                                                                                                                                 
            "type" : "receiver",                                                                                                                                         
            "serviceName" : "firehose:druid:overlord:tsrc-003-0000-0000",                                                                                                
            "bufferSize" : 100000,                                                                                                                                       
            "maxIdleTime" : 9223372036854775807                                                                                                                          
          },                                                                                                                                                             
          "shutoffTime" : "2019-02-04T00:20:00.000Z"                                                                                                                     
        },                                                                                                                                                               
        "interval" : "2019-02-03T00:00:00.000Z/2019-02-04T00:00:00.000Z"                                                                                                 
      },                                                                                                                                                                 
      "firehoseV2" : null                                                                                                                                                
    },                                                                                                                                                                   
    "tuningConfig" : {                                                                                                                                                   
      "type" : "realtime",                                                                                                                                               
      "maxRowsInMemory" : 75000,                                                                                                                                         
      "intermediatePersistPeriod" : "PT15M",                                                                                                                             
      "windowPeriod" : "PT15M",                                                                                                                                          
      "basePersistDirectory" : "/opt/apache-druid-0.13.0-incubating-SNAPSHOT/var/tmp/1549229845092-0",                                                                   
      "versioningPolicy" : {                                                                                                                                             
        "type" : "intervalStart"                                                                                                                                         
      },                                                                                                                                                                 
      "rejectionPolicy" : {                                                                                                                                              
        "type" : "none" 
      },                                                                                                                                                                 
      "maxPendingPersists" : 0,                                                                                                                                          
      "shardSpec" : {                                                                                                                                                    
        "type" : "linear",                                                                                                                                               
        "partitionNum" : 0                                                                                                                                               
      },                                                                                                                                                                 
      "indexSpec" : {                                                                                                                                                    
        "bitmap" : {                                                                                                                                                     
          "type" : "concise"                                                                                                                                             
        },                                                                                                                                                               
        "dimensionCompression" : "lz4",                                                                                                                                  
        "metricCompression" : "lz4",                                                                                                                                     
        "longEncoding" : "longs"                                                                                                                                         
      },                                                                                                                                                                 
      "buildV9Directly" : true,                                                                                                                                          
      "persistThreadPriority" : 0,                                                                                                                                       
      "mergeThreadPriority" : 0,                                                                                                                                         
      "reportParseExceptions" : false,                                                                                                                                   
      "handoffConditionTimeout" : 0,                                                                                                                                     
      "alertTimeout" : 0,                                                                                                                                                
      "segmentWriteOutMediumFactory" : null,                                                                                                                             
      "dedupColumn" : null                                                                                                                                               
    }                                                                                                                                                                    
  },                                                                                                                                                                     
  "context" : { },                                                                                                                                                       
  "groupId" : "index_realtime_tsrc",                                                                                                                                     
  "dataSource" : "tsrc"                                                                                                                                                  
}                                                                                                                                                                              
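
A quick way to confirm this without reading the whole payload is to load it and look for the key directly. The following is a small sketch, assuming the payload has been copied out of the log as a JSON string; `has_flatten_spec` is a hypothetical helper name, and the embedded payload is a trimmed copy of the one logged above.

```python
import json


def has_flatten_spec(task_json):
    """Return True if the task payload's parseSpec carries a flattenSpec."""
    task = json.loads(task_json)
    parse_spec = (
        task.get("spec", {})
        .get("dataSchema", {})
        .get("parser", {})
        .get("parseSpec", {})
    )
    return "flattenSpec" in parse_spec


# Trimmed copy of the payload logged above.
logged_task = """
{
  "type": "index_realtime",
  "spec": {
    "dataSchema": {
      "dataSource": "tsrc",
      "parser": {
        "type": "map",
        "parseSpec": {
          "format": "json",
          "dimensionsSpec": {"dimensions": ["host", "action"]}
        }
      }
    }
  }
}
"""
result = has_flatten_spec(logged_task)
print(result)  # False
```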

@donbowman
Author

It appears this may only work if Content-Type: text/plain is used, and not with Content-Type: application/json.
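
A sketch of what that looks like with the same `requests` client used earlier in this thread: the body is still JSON, but it is sent as a raw string labelled text/plain. `build_plain_text_post` and `send` are hypothetical helper names, and whether this actually makes the flattenSpec take effect is only the observation above, not something verified here.

```python
import json


def build_plain_text_post(msg):
    """Serialize msg to JSON but label the body as text/plain."""
    body = json.dumps(msg)
    headers = {"Content-Type": "text/plain"}
    return body, headers


def send(url, msg, auth=None):
    # requests is imported lazily so the helper above has no dependencies.
    import requests

    body, headers = build_plain_text_post(msg)
    # data= sends the string as-is; json= would default to application/json.
    return requests.post(url, data=body, headers=headers, auth=auth)


body, headers = build_plain_text_post({"host": "me", "values": {"action": "allow"}})
print(headers["Content-Type"], body)
```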

@gianm
Member

gianm commented Feb 20, 2019

I think the issue is that Tranquility doesn't understand flattenSpecs and may not be passing them along. Would switching to Kafka indexing work for you?
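
For illustration, a Kafka supervisor spec carrying the same schema might look roughly like the sketch below, written as a Python dict to match the client code earlier in the thread. The topic name and broker address are placeholders, `kafka_supervisor_spec` is a hypothetical helper, only the `action` path field is carried over, and the spec has not been tested against Druid 0.13.

```python
import json


def kafka_supervisor_spec(topic, bootstrap_servers):
    """Build a Kafka supervisor spec reusing the tsrc schema (topic/brokers are placeholders)."""
    return {
        "type": "kafka",
        "dataSchema": {
            "dataSource": "tsrc",
            "parser": {
                "type": "string",
                "parseSpec": {
                    "format": "json",
                    "flattenSpec": {
                        "useFieldDiscovery": True,
                        "fields": [
                            {"type": "path", "name": "action", "expr": "$.values.action"}
                        ],
                    },
                    "timestampSpec": {"column": "date", "format": "auto"},
                    "dimensionsSpec": {"dimensions": ["host", "action"]},
                },
            },
            "metricsSpec": [],
            "granularitySpec": {
                "type": "uniform",
                "segmentGranularity": "day",
                "queryGranularity": "none",
                "rollup": False,
            },
        },
        "tuningConfig": {"type": "kafka"},
        "ioConfig": {
            "topic": topic,
            "consumerProperties": {"bootstrap.servers": bootstrap_servers},
        },
    }


spec = kafka_supervisor_spec("tsrc-events", "kafka.example:9092")
print(json.dumps(spec, indent=2))
```

The resulting JSON would be submitted with a POST to the Overlord's /druid/indexer/v1/supervisor endpoint, with events written to the Kafka topic instead of the Tranquility HTTP endpoint.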

@donbowman
Author

It's not clear to me that Tranquility is still viable, after digging into it. It has the interface I want, but it doesn't work with the supervisor model that the Kafka ingest uses, it's built on an older version of Druid, it will likely never move to a newer JDK, ... so I gave up on it.
