  • c

    crimsonknave

    1 year ago
    I'm seeing the following error in my logs:
    "err":"retrieve live queries: receive sql: redigo: nil returned"
    It appears to be caused either by running a live query via fleetctl or by ending one of those queries early. However, the errors keep showing up in the logs long after any queries have finished running. They seem to spike at regular 5-minute intervals, which matches my logger_tls_period. Is there something I can do to fix these?
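One way to check whether the errors really recur on a fixed period is to bucket their timestamps by minute and measure the gaps between busy minutes. The sketch below assumes each error entry carries an RFC 3339 timestamp string; the function names and the spike threshold are hypothetical and not part of Fleet.

```python
from collections import Counter
from datetime import datetime


def per_minute_counts(timestamps):
    """Count entries per calendar minute."""
    counts = Counter()
    for ts in timestamps:
        # Keep only "YYYY-MM-DDTHH:MM:SS"; this drops fractional seconds
        # and the trailing zone suffix, which strptime handles poorly.
        t = datetime.strptime(ts[:19], "%Y-%m-%dT%H:%M:%S")
        counts[t.replace(second=0)] += 1
    return counts


def spike_gaps(counts, threshold):
    """Return the gaps, in minutes, between consecutive minutes whose
    count is at least `threshold`."""
    spikes = sorted(minute for minute, n in counts.items() if n >= threshold)
    return [int((b - a).total_seconds() // 60) for a, b in zip(spikes, spikes[1:])]
```

If the gaps come back as a steady run of 5s, the period matches a 5-minute setting; mixed gaps would fit the idea of several overlapping sources.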
  • zwass

    zwass

    1 year ago
    Is 5 minutes also your distributed_interval?
  • c

    crimsonknave

    1 year ago
    No, that's 30 seconds
    But, I haven't run a live query in over an hour and I'm still seeing these messages come in
    Well, they're less regular in 5 minute increments now, but that could be due to more than one query stuck like this
    From what I can tell the redis is empty (but redis is a bit of a mystery to me)
  • zwass

    zwass

    1 year ago
    keys '*' returns nothing?
  • c

    crimsonknave

    1 year ago
    I'm using redis commander and there aren't any entries in the GUI. I got an error when I tried to run that in the commandline bit. Let me dig a bit more.
    The tree view should show keys, but it's empty, which is what I recall from when I had to go in and clean up the live queries before the cleanup logic was implemented.
    Also, we just updated to 3.10.1 from 3.1 (I think, it was old).
    I added a test key and can see it in the tree, so I'm pretty sure it's empty aside from that key.
  • zwass

    zwass

    1 year ago
    Okay thanks for all that info. I'm going to see if we can do some better cleanup for this in the next release.
  • c

    crimsonknave

    1 year ago
    Thanks! Two quick questions: are these sorts of errors something I should be worried about? Are they expected if I run or cancel a query?
  • zwass

    zwass

    1 year ago
    I don't think you need to worry about them. We're being overly noisy about cleaning up older queries.
    But it is strange that you still see them even though Redis is empty... You shouldn't be able to hit that code path if Redis is empty.
  • c

    crimsonknave

    1 year ago
    Anything else I can do to debug right now?
  • zwass

    zwass

    1 year ago
    Is it still happening even after verifying that Redis is empty?
  • c

    crimsonknave

    1 year ago
    Yup, I also redeployed our fleet (in kubernetes)
  • zwass

    zwass

    1 year ago
    Are you seeing those errors in the Fleet server logs or in the logs that osquery clients are writing to Fleet?
    Can you paste a full log line?
  • c

    crimsonknave

    1 year ago
    That was from Fleet, one sec. I may have misspoken.
    I am still seeing them. 9k in the last 15 minutes.
    {
      "component": "service",
      "err": "retrieve live queries: receive sql: redigo: nil returned",
      "ip_addr": "10.125.6.0:19173",
      "level": "info",
      "method": "GetDistributedQueries",
      "took": "12.15416ms",
      "ts": "2021-04-09T18:50:47.401563414Z",
      "x_for_ip_addr": "10.127.50.32"
    }
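To see how much of the volume comes from this one error, the log can be tallied by method and error text. This is a minimal sketch that assumes the server writes one JSON object per line (the entry above is pretty-printed for readability); `count_errors` is a hypothetical helper, not a Fleet tool.

```python
import json
from collections import Counter


def count_errors(lines):
    """Count (method, err) pairs across JSON log lines that have an "err" field."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        err = entry.get("err")
        if err:
            counts[(entry.get("method", "unknown"), err)] += 1
    return counts
```

For example, `count_errors(open("fleet.log")).most_common(5)` would list the five most frequent pairs, where `fleet.log` is a placeholder file name.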
  • zwass

    zwass

    1 year ago
    Is it possible your Redis UI is connected to a different DB than Fleet? Can you verify by running a live query and seeing that some keys appear?
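A single Redis instance serves several numbered logical databases, so a UI pointed at one index can look empty while keys live in another. The output of `INFO keyspace` lists only the databases that currently hold keys; the sketch below parses that text into a mapping. The function name and the sample values in the usage note are illustrative assumptions, not output from this deployment.

```python
def parse_keyspace(info_text):
    """Map logical database index -> key count from `INFO keyspace` output."""
    dbs = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Data lines look like "db0:keys=1,expires=0,avg_ttl=0".
        if not line.startswith("db") or ":" not in line:
            continue
        name, _, fields = line.partition(":")
        index = name[2:]
        if not index.isdigit():
            continue
        stats = dict(f.split("=", 1) for f in fields.split(",") if "=" in f)
        dbs[int(index)] = int(stats.get("keys", 0))
    return dbs
```

Given text such as `db0:keys=1,expires=0,avg_ttl=0` followed by `db3:keys=2,expires=0,avg_ttl=0`, the function returns `{0: 1, 3: 2}`, which would point at database 3 as a second place to look.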
  • c

    crimsonknave

    1 year ago
    Running one now. I see livequery and sql.
    Canceled it and those went away.
  • zwass

    zwass

    1 year ago
    And does live query actually work? I'm just looking at the code and it seems like it should not be possible to hit that line if there are no keys at all.
  • c

    crimsonknave

    1 year ago
    Yup, last time I let it run for a while I got
    ⠓ 59% responded (100% online) | 7169/12159 targeted hosts (7169/7205 online)
    before it felt like it wasn't going to get any more.
    And a bunch of data flowed in
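The percentages in that status line can be reproduced from its counts. This sketch assumes the first figure is responded hosts over targeted hosts and the parenthesized figure is responded hosts over online hosts; that reading is inferred from the numbers rather than confirmed in the thread.

```python
def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


responded, targeted, online = 7169, 12159, 7205

# Rounded to whole percentages, as in the status line.
print(f"{percent(responded, targeted):.0f}% responded")
print(f"{percent(responded, online):.0f}% of online hosts responded")
```

Under that reading, 7169 of 12159 rounds to 59% and 7169 of 7205 rounds to 100%, so nearly every online host answered and the remaining targets were offline.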