  • nyanshak

    1 year ago
    Label / Additional host query question <thread>
    Currently (as I understand it), `osquery_label_update_interval` and `osquery_detail_update_interval` both default to `1h`. An idea to improve this:
    • Support per-query intervals (e.g., in the list of label queries, allow queryA to run once per hour and queryB to run once every 24h). I know that intervals here are highly correlated to load, so this might not actually be feasible.
    Probably separate ideas:
    • For label / detail queries, it would be nice to support `run-once` semantics (to only collect the data one time). It would also be good to be able to have different intervals for labels for different *Teams*. I expect at least the interval part of this will already be supported when teams comes out 🤔
    • For scheduled queries / if Fleet adds an auto-updating client (like launcher), it would be good if there were a way for the launcher to support `run-immediately` (optionally?), where scheduled queries are run once right away, as well as at every `interval` afterward. This is a perpetual annoyance of mine with osquery, and I have some ideas around why it's not supported right now 🤷‍♀️
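
To make the per-query interval and `run-once` ideas above concrete, here is a minimal sketch of how a server might decide which label queries are due for a host. Everything in it is an assumption for illustration: `LabelQuery`, `is_due`, `due_queries`, and the `RUN_ONCE` sentinel are hypothetical names, not Fleet or osquery APIs, and the query names are made up.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical model: none of these names are Fleet or osquery APIs.
RUN_ONCE = None  # sentinel interval meaning "collect one time only"
HOUR = 3600


@dataclass
class LabelQuery:
    name: str
    interval_s: Optional[int]           # seconds between runs, or RUN_ONCE
    last_run_s: Optional[float] = None  # when the host last ran it, if ever


def is_due(query: LabelQuery, now_s: float) -> bool:
    """Return True if the query should be sent to the host at time now_s."""
    if query.last_run_s is None:
        return True   # never collected yet, so always due
    if query.interval_s is RUN_ONCE:
        return False  # run-once: already collected, never repeat
    return now_s - query.last_run_s >= query.interval_s


def due_queries(queries: List[LabelQuery], now_s: float) -> List[str]:
    """Names of all queries that are due at time now_s."""
    return [q.name for q in queries if is_due(q, now_s)]


# All three queries last ran at t=0 on this (hypothetical) host.
queries = [
    LabelQuery("queryA", interval_s=HOUR, last_run_s=0),
    LabelQuery("queryB", interval_s=24 * HOUR, last_run_s=0),
    LabelQuery("ami_id", interval_s=RUN_ONCE, last_run_s=0),
]
```

With this model, two hours after the last run only `queryA` is due, after 25 hours both `queryA` and `queryB` are due, and the run-once query is never sent again. Whether per-query bookkeeping like this is affordable at scale is exactly the load concern raised above.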
  • Noah Talerman

    1 year ago
    Hey @nyanshak why would it be helpful to support per-query update intervals? Are both the details and label use cases related to updating the groups of hosts that are targeted?
    Is the motivation for `run-once` semantics for labels / details similar to the motivation for manual labels? From my understanding, this motivation is tied to the result set never updating.
    Lastly, why would it be helpful to have the `run-immediately` option? What annoyance would this help you get rid of?
  • nyanshak

    1 year ago
    > why would it be helpful to have the `run-immediately` option?
    There are some queries that we want to run regularly, but infrequently. Because osquery waits until the first `interval` has passed to execute the query the first time, ephemeral hosts that live for less than one interval will _never_ execute some queries. Even though we don't want these queries to run super-frequently (every 10 minutes, every hour, whatever), we _do_ want them to run.
    `run-once` ^ There are certain attributes in some environments that we _never_ change. To be more specific, in this env, hosts are created from an AMI and are immutable. If there is _ever_ any change, a new AMI would be built, the original instances destroyed, and new ones spun up to replace the original (old versioned) instances.
    And we could reduce the overhead of running osquery by only collecting the information once, as that's the only time it would be necessary.
    > why would it be helpful to support per-query update intervals?
    Mostly that there are some attributes that never change or change very infrequently, so it's not useful to run them at the same interval as other queries.
    > Are both the details and label use cases related to updating the groups of hosts that are targeted?
    Sort of related, not exactly the same.
    For details: collecting metadata about a host, some of which changes only infrequently (and needs different intervals). For labels: this is definitely for targeting hosts, but some attributes very rarely change, and we want to reduce overhead where we can.
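
The ephemeral-host problem described above can be illustrated with a small simulation. This is a simplified sketch of the behavior as reported in this thread (first execution only after one full interval), not osquery's actual scheduler, which also applies splay; `run_times` and its `run_immediately` parameter are hypothetical names.

```python
from typing import List

HOUR = 3600
DAY = 24 * HOUR


def run_times(interval_s: int, lifetime_s: int,
              run_immediately: bool = False) -> List[int]:
    """Times (seconds since host start) at which a scheduled query executes.

    Simplified model: without run_immediately, the first execution happens
    only after one full interval has elapsed. A host that is destroyed
    before then never runs the query at all.
    """
    first = 0 if run_immediately else interval_s
    return list(range(first, lifetime_s + 1, interval_s))


# A daily query on an ephemeral host that lives for two hours.
default_runs = run_times(DAY, 2 * HOUR)                          # never runs
immediate_runs = run_times(DAY, 2 * HOUR, run_immediately=True)  # runs once at start

# The same query on a host that lives for three days.
long_lived_runs = run_times(DAY, 3 * DAY)
```

In this model the two-hour host produces no results for the daily query under the default behavior, and exactly one result at startup with the hypothetical `run_immediately` option, while long-lived hosts keep the same cadence afterward.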
  • Noah Talerman

    1 year ago
    > Even though we don’t want these queries to run super-frequently (every 10 minutes, every hour, whatever), we _do_ want them to run.
    This makes sense. Zach is planning on bringing up this use case during today’s osquery office hours (I believe office hours just started). The thought is that it makes sense for the solution you proposed, or a similar solution, to make its way into osquery.
  • nyanshak

    1 year ago
    👍 can't attend but may watch the recording after
  • Noah Talerman

    1 year ago
    Sweet!
    I also now better understand the pain point of unnecessarily updating attributes. Thank you for the explanation. Is solving the “reducing osquery overhead” problem closer to a nice-to-have?
    A somewhat related question: Does providing proof of reducing osquery overhead allow for increased confidence in osquery’s performance and thus an easier time convincing other individuals to install the agent? Or is there a different ultimate goal for reducing the overhead?
  • nyanshak

    1 year ago
    > Does providing proof of reducing osquery overhead allow for increased confidence in osquery’s performance and thus an easier time convincing other individuals to install the agent?
    Yes
    > is there a different ultimate goal for reducing the overhead?
    Ultimately, CPU cycles and memory used by osquery are costs incurred by thousands of machines, and may be the difference between (for example) running a t2.micro and a t2.small instance (or whatever the equivalent single-step upgrade in instance class would be). It also adds network and storage costs, etc.
    While the cost of an individual query may be small, in aggregate, with many queries across many hosts, the cost becomes larger and more meaningful.
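
The aggregation argument above is simple multiplication, sketched here. The fleet size and query count are made-up inputs for illustration only, not measurements from this thread, and `executions_per_day` is a hypothetical helper.

```python
DAY_S = 86_400


def executions_per_day(hosts: int, queries: int, interval_s: int) -> int:
    """Total query executions per day across a fleet, assuming every host
    runs every query once per interval (whole runs only)."""
    return hosts * queries * (DAY_S // interval_s)


# Illustrative inputs only: 1,000 hosts, 20 rarely-changing queries.
hourly = executions_per_day(hosts=1_000, queries=20, interval_s=3_600)
daily = executions_per_day(hosts=1_000, queries=20, interval_s=DAY_S)
```

Under these assumed inputs, moving the 20 queries from an hourly to a daily interval cuts fleet-wide executions by a factor of 24. How much CPU, memory, network, or storage each execution costs depends on the query and is not modeled here.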