Created attachment 25045 [details]
I can pinpoint some (possibly all) of the races that TSan reports for 3741d16503a973a99c724bdab9a255a5f07a3637 in the interplay of `add_stage_entry ()` and `process_stage_entries ()`. I will discuss the races here; please find the corresponding reports and stack traces attached.
To get these results, I stopped the GC from collecting the entries (returned immediately from `sgen_finalize_if ()` without executing anything) and lowered the number of entries to 100. As a result, the draining was triggered by `sgen_object_register_for_finalization ()` in a very predictable manner, and this is the root of all the reported data races: while `process_stage_entries ()` is wrapped in `LOCK_GC ... UNLOCK_GC`, `add_stage_entry ()` knows nothing about that lock (by design).
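For reference, here is a minimal sketch of how I read the staging mechanism. This is a hypothetical, simplified mock (using C11 atomics in place of Mono's interlocked primitives; the BUSY intermediate state and the actual drain work of the real code are omitted), not the actual implementation:

```c
#include <assert.h>
#include <stdatomic.h>

/* Simplified mock of the staging buffer; names mirror the real code,
   everything else is an illustrative assumption. */
#define NUM_ENTRIES 100

enum { STAGE_ENTRY_FREE, STAGE_ENTRY_USED };

typedef struct {
    atomic_int state;
    void *obj;
    void *user_data;
} StageEntry;

static StageEntry entries [NUM_ENTRIES];
static atomic_int next_entry;   /* read by adders without holding the GC lock */

/* Lock-free add path: claim an index with a CAS loop.  Returns -1 on
   overflow, telling the caller to take the GC lock and drain. */
static int
add_stage_entry (void *obj, void *user_data)
{
    int index;
    do {
        index = atomic_load (&next_entry);
        if (index >= NUM_ENTRIES)
            return -1;  /* buffer full: caller must drain */
    } while (!atomic_compare_exchange_weak (&next_entry, &index, index + 1));

    entries [index].obj = obj;
    entries [index].user_data = user_data;
    atomic_store (&entries [index].state, STAGE_ENTRY_USED);
    return index;
}

/* Drain path: in Mono this runs wrapped in LOCK_GC ... UNLOCK_GC, so only
   the lock-free adders race against it. */
static int
process_stage_entries (void)
{
    int processed = 0;
    for (int i = 0; i < NUM_ENTRIES; ++i) {
        if (atomic_load (&entries [i].state) != STAGE_ENTRY_USED)
            continue;
        /* ... hand entries [i].obj to the finalization machinery ... */
        atomic_store (&entries [i].state, STAGE_ENTRY_FREE);
        ++processed;
    }
    atomic_store (&next_entry, 0);  /* unsynchronised w.r.t. add_stage_entry */
    return processed;
}
```

The important point for the discussion below is that only `next_entry` is claimed with an interlocked operation; the reads of `state` and `next_entry` on the add path are deliberately unsynchronised with the drain path.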
One thing that can occur is that two threads pick the same `next_entry`. One of them (call it "T1") pauses there while the other stores its data successfully. More entries are added, then the entries get drained due to an overflow, and only then does "T1" continue. At this point there is no synchronisation between "T1" (checking `entries [index].state != STAGE_ENTRY_FREE`) and the thread that drains the entries (setting `entries [i].state = STAGE_ENTRY_FREE`), which results in #3 (and, with the roles reversed and in a delayed way, in #5). This should not be a problem since `state` is a `gint32`, and such values are read and written within a single instruction on common platforms.
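The "harmless because it is a `gint32`" argument relies on aligned 32-bit accesses happening in one instruction, so a racing reader observes either the old or the new state, never a torn value. As a rough runtime proxy for that (not a formal guarantee for plain, non-atomic accesses), C11 lets one check that 32-bit atomics are lock-free on the platform at hand:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Proxy check: if 32-bit atomics are lock-free, the hardware supports
   single-instruction aligned 32-bit loads and stores. */
int
gint32_access_is_untearable (void)
{
    _Atomic int32_t probe = 0;
    return atomic_is_lock_free (&probe);
}
```

On x86 and ARM this returns non-zero; an exotic platform where it does not would invalidate the benign-race argument.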
Of course, #4 occurs when `process_stage_entries ()` sets `*next_entry = 0`: there is no synchronisation between `process_stage_entries ()` and `add_stage_entry ()` (which reads this value quite often). This is a race (by design), but it again only involves a `gint32` and can therefore be considered harmless.
The other reports (#01, #02, #06, #07, #08, #09) look like false positives to me, as TSan seems to have trouble following the data flow between `process_stage_entries ()` and `add_stage_entry ()`. For #06 - #09, the message "As if synchronized via sleep" seems to be spot on. #01 and #02 seem to be connected to the unsynchronised read of `state` in `process_stage_entries ()` (technically another `gint32` data race): interlocking it (`state = InterlockedRead (&entries [i].state)`) mutes these reports.
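For illustration, a sketch of the kind of change that mutes #01/#02: reading `state` through an atomic load instead of a plain load. The C11 version below is only an analogue of Mono's `InterlockedRead ()` (which, depending on the platform, may additionally imply a full barrier); even a relaxed atomic load compiles to the same single instruction on x86 while telling the compiler, and TSan, that the racy read is intentional:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical cut-down StageEntry, just enough to show the read. */
typedef struct {
    _Atomic int32_t state;
    /* ... obj, user_data ... */
} StageEntry;

/* Atomic read of state; the C11 counterpart of
   state = InterlockedRead (&entries [i].state). */
static int32_t
read_state (StageEntry *entry)
{
    return atomic_load_explicit (&entry->state, memory_order_relaxed);
}
```

The behaviour is unchanged; the annotation only makes the intentional race visible to the tooling.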
As far as I can tell, all memory barriers are in place, and I cannot see how any of these reports could correspond to a harmful execution order (beyond the benign `gint32` races described above).