Channels
  • c

    ccc

    3 years ago
    i can’t figure out why my unwanted chrome extensions query isn’t working.
    SELECT * FROM users JOIN chrome_extensions USING (uid);
    works and returns the list of extensions, but
    SELECT * FROM users JOIN chrome_extensions USING (uid) WHERE identifier='gkojfkhlekighikafcpjkiklfbnlmeio';
    returns empty even though its there
  • zwass

    zwass

    3 years ago
    try replacing
    JOIN
    with
    CROSS JOIN
  • c

    ccc

    3 years ago
    oh sweet that works
    is this something wrong with my set up or should the packs be updated to use cross join?
  • zwass

    zwass

    3 years ago
    Use
    CROSS JOIN
    for those queries where you need to join against the users table. The sqlite optimizer will sometimes change the order of the join otherwise and cause false negatives.
  • clong

    clong

    3 years ago
    @zwass do you think the query packs should be updated to use cross join everywhere that a query is being joined against the users table?
  • zwass

    zwass

    3 years ago
    Yes. But when that is done, we need to ensure that the users table appears before the
    cross join
    .
  • clong

    clong

    3 years ago
    ah okay, that’s good to know as i think we’re doing that in a LOT of places
  • defensivedepth

    defensivedepth

    3 years ago
    @zwass We should get that snippet of wisdom somewhere in the docs / FAQ
  • v

    Vibhor

    3 years ago
    Instead of CROSS JOIN, I would recommend to use subquery or CTE
    Something like:
    SELECT * 
    FROM   ( 
                      SELECT     * 
                      FROM       users 
                      cross join chrome_extensions 
                      using      (UID)) foo 
    WHERE  foo.identifier = 'blpcfgokakmgnkcojhhkbfbldkacnbeo' ;
    
    WITH foo AS 
    ( 
               SELECT     * 
               FROM       users 
               cross join chrome_extensions 
               using      (UID)) 
    SELECT * 
    FROM   foo 
    WHERE  identifier = 'blpcfgokakmgnkcojhhkbfbldkacnbeo';

    Please note CROSS JOIN or Cartieian product may give wrong results and can be resource intensive, depending on how much data its going to generate
  • gguli

    gguli

    3 years ago
    Sounds like a bug. Thanks for details @Vibhor @zwass
    Bad thing about cross join is that, we will prohibit sqlite for any optimization. Will open issue and reference to that discussion
    Hmm, in my case, it actually works. Is it OS specific thing?(Tried on macOS)
  • v

    Vibhor

    3 years ago
    What is the version of MacOSX? I can reproduce the problem on Mac OSX 10.14.3
  • zwass

    zwass

    3 years ago
    @Vibhor you are making assertions here that are not correct. Please see the SQLite docs (https://www.sqlite.org/lang_select.html). I’ll include some excerpts from the docs below:
    All joins in SQLite are based on the cartesian product of the left and right-hand datasets.

    There is no difference between the "INNER JOIN", "JOIN" and "," join operators. They are completely interchangeable in SQLite. The "CROSS JOIN" join operator produces the same result as the "INNER JOIN", "JOIN" and "," operators, but is handled differently by the query optimizer in that it prevents the query optimizer from reordering the tables in the join.

    See also https://www.sqlite.org/optoverview.html#crossjoin
    Note the docs indicate that the results will be the same, but this is not the case because changing the loop nesting ordering actually changes the results of the tables in osquery (remember that some tables only return data when they have the appropriate constraints in the WHERE clause). This would not be a property that the SQLite optimizer expects.
    There is also a good explanation for why it works for some folks and not others: the optimizer is making different decisions about reordering tables.
    @User per the docs, cross join will not prohibit all optimizations... it will just prohibit the one that causes incorrect results.
    @Vibhor it is entirely possible to rewrite this query without cross join by using a sub query and IN clause, however your examples both still include cross join.
  • v

    Vibhor

    3 years ago
    Hi @zwass I didn’t check the sqlite doc. However its too bad that there is no difference in sqlite engine for CROSS JOIN and INNER JOIN. IIRC, CROSS JOIN and INNER JOIN are two different operators. In set theory, its A x B Vs A intersection B.
    In future if sqlite team fix this, then whoever is using cross join querypack will end up in different issue (that is re-writing their query packs)
    Also, in my example, I used inner join not really cross join. Its a different case, that currently sqlite doesn’t differentiate between these two operators
  • zwass

    zwass

    3 years ago
    Please read the docs
    There is a variety of misinformation you are providing.
  • v

    Vibhor

    3 years ago
  • zwass

    zwass

    3 years ago
    The SQLite team will not fix this because it is caused by the osquery table implementation violating laws of the relational algebra.
  • v

    Vibhor

    3 years ago
    I see. My comments were based on general differences between two operators
  • zwass

    zwass

    3 years ago
    You are linking images for SQL server as far as I can tell. From the SQLite docs:
    All joins in SQLite are based on the cartesian product of the left and right-hand datasets. If the join-operator is "CROSS JOIN", "INNER JOIN", "JOIN" or a comma (",") and there is no ON or USING clause, then the result of the join is simply the cartesian product of the left and right-hand datasets.

    Also
    The special handling of CROSS JOIN is an SQLite-specific feature and is not a part of standard SQL.
  • v

    Vibhor

    3 years ago
  • zwass

    zwass

    3 years ago
    Any join without a constraint generates a Cartesian product.
    It’s up to the query optimizer to apply any constraints appropriately so as to not actually waste the cpu generating the Cartesian product and then selecting on it.
    @User one more thing I realize is that you must run osquery as the root user in order to see the bug (because the table implementations only violate the relational algebra laws when running as root).
  • gguli

    gguli

    3 years ago
    Thank you for looking into that @zwass. As you mentioned, osquery is definitely very strange case for sqlite, because of required, additional, optimization queries. (enum ColumnOptions). We, trick sqlite to provide us required column constraints when possible virtual_table.cpp:767 . Do you think, similar thing can help sqlite in case of ADDITIONAL and OPTIMIZED?
  • zwass

    zwass

    3 years ago
    I think it may be possible to nudge sqlite into using the correct ordering for these tables that require the constraint.
    Perhaps by playing with https://www.sqlite.org/vtab.html#the_xbestindex_method
    Actually @User they even give some explicit advice that looks relevant to our situation: https://www.sqlite.org/vtab.html#return_value. Tables like
    chrome_extensions
    actually are table valued functions with required constraints.
  • gguli

    gguli

    3 years ago
    Did not know about that. It's cool, we can use SQLITE_CONSTRAINT for required columns and manipulate with high costs in case of ADDITIONAL and OPTIMIZED. Are you be working on that or should I put it in my todo list? 😄
  • zwass

    zwass

    3 years ago
    I'm taking a look into it right now. Currently struggling with my buck build which seems to be picking up incorrect dependencies from homebrew.
  • gguli

    gguli

    3 years ago