# general

vaar

06/19/2020, 9:03 PM
I am facing an issue with the process_events table: when a pid is reused, joins with other tables can return wrong results. Did you ever consider implementing an additional random ID (larger than uint16)? Most EDRs implement this to avoid pid-reuse ambiguity.

zwass

06/19/2020, 9:13 PM
This is an interesting one... I'd be curious to know more about how other tools deal with it. Can you open an issue on GitHub with some description of that?

seph

06/21/2020, 12:56 PM
I'm curious how that works in practice, and how it helps.

Mike Myers

06/22/2020, 9:56 PM
Maybe osquery could tag the process events with an additional randomly generated ID for uniqueness in the logs, but I don't think it would avoid the issue you're seeing, which I bet is a race condition between two point-in-time queries that constitute the JOIN. This is a design limitation, I think. @alessandrogario might know differently

seph

06/22/2020, 9:57 PM
I’m hesitant to suggest adding a ULID without understanding the problem it would solve. I don’t think it would help correlate between osquery tables. Best case, it provides a unique identifier for external systems, but those external systems should be able to make their own unique identifiers.

alessandrogario

06/22/2020, 10:01 PM
How would an additional ID solve this issue? We need a way to map pid -> internal-pid-that-is-never-reused -> back to pid
we can come up with a way to generate it, but when joining we always end up using a standard pid once again

seph

06/22/2020, 10:02 PM
Yes, exactly. And it’s extra state to track

alessandrogario

06/22/2020, 10:02 PM
i.e.: we decide to use SPECIAL_UUID, convert it to a pid, and then access /proc/<pid>
but that pid may have been reused anyway
there's a setting on Linux, pid_max, that can be tweaked
the best option would be to stop reusing pids entirely, but I'm not sure the kernel can be configured to avoid that
maybe there's something we can do, but I have to test some stuff first
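The Linux setting referred to here is the kernel's pid_max, exposed as /proc/sys/kernel/pid_max. A minimal sketch of reading it (Linux only; raising it requires root, via sysctl):

```python
# pid_max is the value at which Linux pids wrap around and reuse begins;
# raising it widens the window before a pid can be recycled.
with open("/proc/sys/kernel/pid_max") as f:
    pid_max = int(f.read())

print(pid_max)
```

On many modern distributions this defaults to 32768 or 4194304.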

seph

06/22/2020, 10:07 PM
AFAIK the underlying OS APIs use the pid for correlation, so I don’t see what else there is to do

Mike Myers

06/22/2020, 10:13 PM
I somewhat recall PID-reuse behavior being OS-specific too

vaar

06/24/2020, 1:41 PM
yeah, it is not easy to solve in osquery. Some EDRs build an internal mapping from (pid, process start time) to a unique identifier, so each process instance stays unique even when its pid is reused. In osquery it would be easy to add that to the processes table, but not to the other tables with a pid field.
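A minimal sketch of the (pid, start time) identifier described here, assuming Linux's /proc/<pid>/stat layout (field 22 is the process start time in clock ticks since boot); the function names are illustrative, not osquery API:

```python
import os

def process_start_time(pid: int) -> int:
    """Return the process start time in clock ticks since boot
    (field 22 of /proc/<pid>/stat, Linux only)."""
    with open(f"/proc/{pid}/stat", "rb") as f:
        data = f.read()
    # The comm field (field 2) can contain spaces and parentheses,
    # so split after the LAST ')' before indexing the remaining fields.
    rest = data.rsplit(b")", 1)[1].split()
    return int(rest[19])  # field 22 overall = index 19 after fields 1-3 handling

def stable_process_id(pid: int) -> str:
    """A (pid, start-time) pair survives pid reuse: a recycled pid
    gets a different start time."""
    return f"{pid}.{process_start_time(pid)}"

print(stable_process_id(os.getpid()))
```

Usage: record the identifier when a process is first observed; a later lookup with the same pid but a different start time is a different process.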

alessandrogario

06/24/2020, 1:52 PM
Yeah, I actually had an idea for this the other day:
1. We add support for a secondary process ID (internally we can use pid.timestamp so that we don't have state to track)
2. Add support for using that ID in SQL
3. Update our utilities that scan /proc so that they opendir() the pid folder under /proc to pin it, then fstat() to check the timestamp, returning ENOENT if they don't match
cc @theopolis
👍 2
if it's interesting, we can open a blueprint issue so that people can weigh the pros and cons of implementing something like this
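Step 3 of the idea above might look roughly like this on Linux. This is a sketch, not osquery's implementation; it assumes the timestamps on /proc/<pid> stay fixed for a process's lifetime, and open_pinned_proc_dir is an illustrative name:

```python
import errno
import os

def open_pinned_proc_dir(pid: int, expected_start: float) -> int:
    """Open /proc/<pid> and verify it still belongs to the process
    instance we expect. The open fd stays bound to that instance,
    so later reads through it cannot race with pid reuse.
    expected_start is the st_mtime recorded when the process was
    first observed."""
    fd = os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)
    st = os.fstat(fd)
    if st.st_mtime != expected_start:
        os.close(fd)
        # Same pid, different process: report it as gone.
        raise OSError(errno.ENOENT, "pid was reused")
    return fd

# Usage: record the timestamp at first sight, re-check on every scan.
pid = os.getpid()
first_seen = os.stat(f"/proc/{pid}").st_mtime
fd = open_pinned_proc_dir(pid, first_seen)
os.close(fd)
```

The design point is that the check and the subsequent /proc reads go through the same pinned fd, rather than re-resolving the pid by path each time.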

seph

06/24/2020, 5:40 PM
Is the intent that if you join between two tables, something implicit (like the pid timestamp) will prevent joins from breaking? That seems clever, though I wonder what problem we’re solving.
Is it joins within a short time interval? Or archival data in some SIEM?

alessandrogario

06/24/2020, 5:43 PM
How short the interval is depends on pid_max, osquery event expiration, how often events are generated, and how often scheduled queries are hitting those tables
but yeah a blueprint could give us more feedback from users
it doesn't seem like archival is required now (unless I'm wrong)