Loughborough University
Browse

An SMT solver for regular expressions and linear arithmetic over string length

Download (855.9 kB)
conference contribution
posted on 2021-07-21, 14:56 authored by Murphy Berzish, Mitja Kulczynski, Federico Mora, Florin Manea, Joel DayJoel Day, Dirk Nowotka, Vijay Ganesh
We present a novel length-aware solving algorithm for the quantifier-free first-order theory over regex membership predicate and linear arithmetic over string length. We implement and evaluate this algorithm and related heuristics in the Z3 theorem prover. A crucial insight that underpins our algorithm is that real-world regex and string formulas contain a wealth of information about upper and lower bounds on lengths of strings, and such information can be used very effectively to simplify operations on automata representing regular expressions. Additionally, we present a number of novel general heuristics, such as the prefix/suffix method, that can be used to make a variety of regex solving algorithms more efficient in practice. We showcase the power of our algorithm and heuristics via an extensive empirical evaluation over a large and diverse benchmark of 57256 regex-heavy instances, almost 75% of which are derived from industrial applications or contributed by other solver developers. Our solver outperforms five other state-of-the-art string solvers, namely, CVC4, OSTRICH, Z3seq, Z3str3, and Z3-Trau, over this benchmark, in particular achieving a speedup of 2.4 × over CVC4, 4.4 × over Z3seq, 6.4 × over Z3-Trau, 9.1 × over Z3str3, and 13 × over OSTRICH.

Funding

The work of Federico Mora is supported by NSF grants CNS-1739816 and CCF-1837132, by the DARPA LOGiCS project under contract FA8750-20-C-0156, by the iCyPhycenter, and by gifts from Intel, Amazon, and Microsoft. The work of Florin Manea is supported by the DFG grant 389613931.

History

School

  • Science

Department

  • Computer Science

Published in

Computer Aided Verification

Volume

Part II

Pages

289-312

Source

33rd International Conference on Computer-Aided Verification (CAV 2021)

Publisher

Springer

Version

  • VoR (Version of Record)

Rights holder

© The Authors

Publisher statement

This is an Open Access Chapter. It is published by Springer under the Creative Commons Attribution 4.0 International Licence (CC BY 4.0). Full details of this licence are available at: https://creativecommons.org/licenses/by/4.0/

Acceptance date

2021-05-31

Publication date

2021-07-15

Copyright date

2021

ISBN

9783030816872; 9783030816889

ISSN

0302-9743

eISSN

1611-3349

Book series

Lecture Notes in Computer Science; 12760

Language

  • en

Editor(s)

Alexandra Silva; K. Rustan M. Leino

Location

Online

Event dates

18th July 2021 - 24th July 2021

Depositor

Dr Joel Day. Deposit date: 21 July 2021

Usage metrics

    Loughborough Publications

    Categories

    No categories selected

    Licence

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC