@keiyakins @halcy also the difference between something that can match regular expressions and something that can parse XML (okay, HTML is a different beast) is surprisingly small: intersection and complement (with just those two added, the languages are still regular) plus recursion (which on its own gets you context-free grammars, and combined with the other two takes it past context-free into Boolean grammars).
entirely unrelated to the original point by now but I've been studying these things lately and I hope it was interesting 😅
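
in case the "still regular" part sounds hand-wavy, here's a rough sketch of why, assuming you represent a regular language as a complete DFA: complement just flips the accepting states, and intersection runs two DFAs in lockstep (the product construction). All the names here (`DFA`, `complement`, `intersection`, `even_as`, `ends_b`) are made up for illustration, and this only covers the intersection/complement half, not the recursion part.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DFA:
    """A complete DFA: delta must have an entry for every (state, symbol)."""
    states: frozenset
    alphabet: frozenset
    delta: dict
    start: object
    accepting: frozenset

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


def complement(d):
    # Same machine, but accept exactly where the original rejected.
    return DFA(d.states, d.alphabet, d.delta, d.start, d.states - d.accepting)


def intersection(a, b):
    # Product construction: track both machines' states as a pair.
    assert a.alphabet == b.alphabet
    states = frozenset(product(a.states, b.states))
    delta = {
        ((p, q), s): (a.delta[(p, s)], b.delta[(q, s)])
        for (p, q) in states
        for s in a.alphabet
    }
    accepting = frozenset(
        (p, q) for (p, q) in states if p in a.accepting and q in b.accepting
    )
    return DFA(states, a.alphabet, delta, (a.start, b.start), accepting)


AB = frozenset("ab")

# Hypothetical example language 1: strings with an even number of 'a's.
even_as = DFA(
    states=frozenset({0, 1}),
    alphabet=AB,
    delta={(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1},
    start=0,
    accepting=frozenset({0}),
)

# Hypothetical example language 2: strings ending in 'b'.
ends_b = DFA(
    states=frozenset({0, 1}),
    alphabet=AB,
    delta={(0, "a"): 0, (1, "a"): 0, (0, "b"): 1, (1, "b"): 1},
    start=0,
    accepting=frozenset({1}),
)

both = intersection(even_as, ends_b)
neither_or_one = complement(both)

print(both.accepts("aab"))           # True: two 'a's and ends in 'b'
print(both.accepts("ab"))            # False: odd number of 'a's
print(neither_or_one.accepts("ab"))  # True: complement of the above
```

the result of both operations is just another DFA, which is the whole point: you never leave the regular languages until recursion shows up.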