nyanshak
03/08/2021, 11:51 PM
osquery_label_update_interval and osquery_detail_update_interval both default to 1h.
An idea to improve this:
• support per-query intervals (e.g., in the list of label queries, allow queryA to run once per hour and queryB to run once every 24h)
I know that intervals here are highly correlated to load, so this might not actually be feasible.
• run-once semantics (to only collect the data one time). It would be good to be able to have different intervals for labels for different Teams. I expect at least the interval part of this will already be supported when teams comes out 🤔
• For scheduled queries / if Fleet adds an auto-updating client (like launcher), it would be good if there were a way for the launcher to support run-immediately (optionally?) where scheduled queries are run once right away, as well as at every interval afterward. This is a perpetual annoyance of mine with osquery, and I have some ideas around why it's not supported right now 🤷♀️
Noah Talerman
03/16/2021, 3:36 PM
Is the motivation for run-once semantics for labels / details similar to the motivation for manual labels? From my understanding this motivation is tied to the result set never updating.
Lastly, why would it be helpful to have the run-immediately option? What annoyance would this help you get rid of?
nyanshak
03/16/2021, 3:47 PM
> why would it be helpful to have the run-immediately option?
There are some queries that we want to run regularly, but infrequently. Because osquery waits until the first interval has passed to execute the query the first time, this means that ephemeral hosts will never execute some queries.
Even though we don't want these queries to run super-frequently (every 10 minutes, every hour, whatever), we do want them to run.
run-once
^ There are certain attributes in some environments that we never change. To be more specific, in this env, hosts are created from an AMI and are immutable. If there is ever any change, a new AMI would be built, the original instances destroyed, and new ones spun up to replace the original (old versioned) instances.
> why would it be helpful to support per-query update intervals?
Mostly that there are some attributes that never change or very infrequently change, so it's not useful to run at the same interval as other queries.
> Are both the details and label use cases related to the updating the groups of hosts that are targeted?
Sort of related, not exactly the same. For details: collecting metadata about a host, some of which could only infrequently change (and needs different intervals). For labels: this is definitely for targeting hosts, but some attributes very rarely change, and we want to reduce overhead where we can.
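A minimal Python sketch of the scheduling behavior described in this message, under a simplified model (no splay, no agent restarts). The function name and the `run_immediately` flag are hypothetical: the flag stands in for the option proposed in this thread and is not an existing osquery or Fleet setting. Each call takes its own interval, which is also roughly what per-query intervals would look like.

```python
from typing import List

HOUR = 60 * 60
DAY = 24 * HOUR


def execution_times(interval_s: int, host_lifetime_s: int,
                    run_immediately: bool = False) -> List[int]:
    """Return offsets (seconds since agent start) at which a query would run.

    Simplified model, not osquery's actual scheduler: with
    run_immediately=False the first run happens only after one full
    interval has passed; with run_immediately=True (hypothetical option)
    the query also runs once at offset 0.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    start = 0 if run_immediately else interval_s
    return list(range(start, host_lifetime_s + 1, interval_s))


# An ephemeral host that lives 6 hours, with a query scheduled every 24h.
print(execution_times(DAY, 6 * HOUR))                        # []
print(execution_times(DAY, 6 * HOUR, run_immediately=True))  # [0]
```

In this model the 24h query never runs on the short-lived host unless it is also run once at startup, while an hourly query on the same host runs either way.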
Noah Talerman
03/16/2021, 5:09 PM
> Even though we don’t want these queries to run super-frequently (every 10 minutes, every hour, whatever), we do want them to run.
This makes sense. Zach is planning on bringing up this use case during today’s osquery office hours (I believe office hours just started). The thought is that it makes sense for the solution you proposed, or a similar solution, to make its way into osquery.
nyanshak
03/16/2021, 5:10 PM
Noah Talerman
03/16/2021, 5:11 PM
nyanshak
03/16/2021, 5:21 PM
> Does providing proof of reducing osquery overhead allow for increased confidence in osquery’s performance and thus an easier time convincing other individuals of installing the agent?
Yes
> is there a different ultimate goal for the reducing of overhead?
Ultimately, CPU cycles and memory used by osquery are costs incurred by thousands of machines, and may be the difference between (for example) running a t2.micro and t2.small instance (or whatever equivalent single upgrade instance class would be). It also costs more in network & storage costs, etc. While on an individual query basis, the cost may be small, but in aggregate with many queries across many hosts, the cost becomes larger and more meaningful.
Noah Talerman
03/16/2021, 7:09 PM
> While on an individual query basis, the cost may be small, but in aggregate with many queries across many hosts, the cost becomes larger and more meaningful.
Got it. This is all new to me so I have some reading to do on costs incurred by machines. This is an awesome breadcrumb to start that research.