Ok, so you're arguing that userspace should use mutexes and not spinlocks.
Which I agree with; most of the time userspace spinlocks don't fit well.
But TFA is clearly comparing the two, and it observes that adding sched_yield() to a variety of spinlock implementations has an interesting positive effect on the spinlocks as tested.
Actually no, I wouldn't make that claim; while a futex-based adaptive mutex is a very good default, spinlocks can still be appropriate for some applications.
What I'm saying is that if your use case is such that you expect enough contention to consider using TATAS (which is actually a pessimization in the uncontended case) and to look into optimizing sched_yield, a spinlock is probably not appropriate in the first place.
Edit: hence a spinlock shouldn't bother with yield and should just do a tight xchg spin (I haven't measured it in a while, but I've heard rumors that pause can severely harm acquire latency on very recent CPUs, as it quickly puts them into a deeper power-saving mode than in the past).
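To make the TAS vs. TATAS distinction concrete, here is a rough C++ sketch of the two spin loops I have in mind. It is not the code from TFA: the names TasLock, TatasLock and count_with are made up for illustration, and I'm assuming the "test first" flavor of TATAS, which is the one that pays an extra load when the lock is free.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Plain test-and-set: every attempt is an atomic exchange
// (typically an xchg on x86). No pause and no sched_yield in the loop,
// i.e. the "tight xchg spin".
struct TasLock {
    std::atomic<bool> locked{false};

    void lock() {
        while (locked.exchange(true, std::memory_order_acquire)) {
            // spin until the previous value was false
        }
    }
    void unlock() { locked.store(false, std::memory_order_release); }
};

// Test-and-test-and-set (assumed "test first" variant): read the flag,
// and only attempt the exchange when the lock looks free. The extra load
// is wasted work when nobody else holds the lock.
struct TatasLock {
    std::atomic<bool> locked{false};

    void lock() {
        for (;;) {
            if (!locked.load(std::memory_order_relaxed) &&
                !locked.exchange(true, std::memory_order_acquire)) {
                return;  // we flipped false -> true, lock acquired
            }
        }
    }
    void unlock() { locked.store(false, std::memory_order_release); }
};

// Hypothetical helper: several threads increment a shared counter under
// the given lock type. Only checks mutual exclusion, not performance.
template <class Lock>
long count_with(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    Lock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iters] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;  // protected by the spinlock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling count_with<TasLock>(2, 5000) or count_with<TatasLock>(2, 5000) should return 10000 if the lock works. That only shows correctness; it says nothing about which loop is faster, which is something you'd have to measure on your own hardware.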