posted on 2021-07-21, 14:56authored byMurphy Berzish, Mitja Kulczynski, Federico Mora, Florin Manea, Joel DayJoel Day, Dirk Nowotka, Vijay Ganesh
We present a novel length-aware solving algorithm for the quantifier-free first-order theory over regex membership predicate and linear arithmetic over string length. We implement and evaluate this algorithm and related heuristics in the Z3 theorem prover. A crucial insight that underpins our algorithm is that real-world regex and string formulas contain a wealth of information about upper and lower bounds on lengths of strings, and such information can be used very effectively to simplify operations on automata representing regular expressions. Additionally, we present a number of novel general heuristics, such as the prefix/suffix method, that can be used to make a variety of regex solving algorithms more efficient in practice. We showcase the power of our algorithm and heuristics via an extensive empirical evaluation over a large and diverse benchmark of 57256 regex-heavy instances, almost 75% of which are derived from industrial applications or contributed by other solver developers. Our solver outperforms five other state-of-the-art string solvers, namely, CVC4, OSTRICH, Z3seq, Z3str3, and Z3-Trau, over this benchmark, in particular achieving a speedup of 2.4 × over CVC4, 4.4 × over Z3seq, 6.4 × over Z3-Trau, 9.1 × over Z3str3, and 13 × over OSTRICH.
Funding
The work of Federico Mora is supported by NSF grants CNS-1739816 and CCF-1837132, by the DARPA LOGiCS project under contract FA8750-20-C-0156, by the iCyPhycenter, and by gifts from Intel, Amazon, and Microsoft. The work of Florin Manea is supported by the DFG grant 389613931.
History
School
Science
Department
Computer Science
Published in
Computer Aided Verification
Volume
Part II
Pages
289-312
Source
33rd International Conference on Computer-Aided Verification (CAV 2021)
This is an Open Access Chapter. It is published by Springer under the Creative Commons Attribution 4.0 International Licence (CC BY 4.0). Full details of this licence are available at: https://creativecommons.org/licenses/by/4.0/