Discontinuities in pattern inference

Reidenbach, Daniel

Reidenbach_TCS_Discontinuities_Pattern_Inference.pdf (415.61 kB)

Discontinuities in pattern inference

journal contribution

posted on 2008-07-09, 08:14 authored by Daniel Reidenbach

This paper deals with the inferrability of classes of E-pattern languages—also referred to as extended or erasing pattern languages—from positive data in Gold’s model of identification in the limit. The first main part of the paper shows that the recently presented negative result on terminal-free E-pattern languages over binary alphabets does not hold for other alphabet sizes, so that the full class of these languages is inferrable from positive data if and only if the corresponding terminal alphabet does not consist of exactly two distinct letters. The second main part yields the insight that the positive result on terminal-free E-pattern languages over alphabets with three or four letters cannot be extended to the class of general E-pattern languages. With regard to larger alphabets, the extensibility remains open. The proof methods developed for these main results do not directly discuss the (non-)existence of appropriate learning strategies, but they deal with structural properties of classes of E-pattern languages, and, in particular, with the problem of finding telltales for these languages. It is shown that the inferrability of classes of E-pattern languages is closely connected to some problems on the ambiguity of morphisms so that the technical contributions of the paper largely consist of combinatorial insights into morphisms in word monoids.

History

School

Science

Department

Computer Science

Citation

REIDENBACH, D., 2008. Discontinuities in pattern inference. Theoretical Computer Science, 397 (1-3), pp. 166-193

Publisher

Publication date

2008

Notes

This is a journal article. It was published in the journal, Theoretical Computer Science [© Elsevier]. The definitive version is available at: http://dx.doi.org/doi:10.1016/j.tcs.2008.02.029