This post is about the notion of Vapnik-Chervonenkis dimension — a concept that arose more-or-less simultaneously in probability theory, in model theory (the part of mathematical logic dealing with abstract structures), and in combinatorics. It is fascinating how this concept which just has to do with set systems (with no additional structure) ends up having connections with so many different areas of mathematics — including very recent developments in model theory due to Hrushovski and Loeser.
The goal of this post is to give readers (with diverse backgrounds) a glimpse of these connections/applications. I have skimped on many details obviously — the idea being to show just enough to whet the appetite of the reader to learn more about the topic.
Since the post is somewhat long I have organized it into sections and the reader might jump directly to the section most interesting to her. The sections are independent of each other (modulo the initial definitions and notation provided in Section 1). They are enumerated as follows.
- In Section 1 I give the basic definitions and discuss the famous “Dichotomy” lemma.
- In Section 2 I discuss applications to probability theory.
- In Section 3 I discuss applications to discrete and computational geometry.
- In Section 4 I discuss connections with model theory (assuming no background in logic).
- In Section 5 I relate VC density with the more geometric notion of
-patterns.
- In Section 6 I discuss the “cohomological method” for bounding the VC codensity of (or equivalently the number of
-patterns realized by) a family of subsets of a given set.
- In Section 7 I review briefly some history of the quantitative problem of obtaining tight upper bounds on the VC codensity of definable families in different structures.
- Finally, in Section 8 I discuss recent work with Deepam Patel on obtaining tight upper bounds on the VC density of definable families over a non-archimedean valued field.
Standard disclaimer: It goes without saying that the views expressed are mine and may not reflect those of my co-authors/co-conspirators. The LaTeX to WordPress conversion was done using the latex2wp package written by Luca Trevisan. Thanks Luca!
1. Origin and Definition
The Vapnik-Chervonenkis (henceforth VC) dimension (as well as density) of a set
of subsets of a set
is a combinatorial notion which is motivated independently by several different areas of mathematics. Its origin can be traced to probability theory (Vapnik and Chervonenkis), model theory (Shelah), and combinatorics (Sauer) — all around the same time.
For any subset
of
, we denote

We say that a (finite) subset
of
, is shattered by
, if

The VC dimension,
, is the largest
, such that there exists a subset
with
which is shattered by
. If no such
exists, we set
.
The definition of VC definition is best motivated by the following lemma due (independently) to Sauer and Shelah.
Sauer, N. (1972), “On the density of families of sets”, Journal of Combinatorial Theory, Series A, 13: 145-147, doi:10.1016/0097-3165(72)90019-2.
Shelah, Saharon (1972), “A combinatorial problem; stability and order for models and theories in infinitary languages”, Pacific Journal of Mathematics, 41: 247-261, doi:10.2140/pjm.1972.41.247.
In the statement of the lemma and later we will use the convenient notation

to denote

Lemma 1 (Dichotomy) Suppose that
. Then for all
, with
,

In particular, there exists
, such that for all
,

The name “Dichotomy” is justified by the following observation.
We denote

Lemma 1 implies that exactly one of the following two alternatives is true.
;
- there exists
such that
.
The VC density, denoted
, of
is defined by

It is an immediate consequence of Lemma 1 that:

The above inequality can be strict. However, it follows from Lemma 1 that that

Later on in this post we will focus mostly on VC density rather than on VC dimension, since VC density can be related to more geometric notions (such as
-patterns to be defined later).
2. Application in probability theory
The original motivation that led Vapnik and Chevonenkis to define VC dimension came from probability theory — namely, to provide necessary and sufficient condition for uniform law of large numbers to hold in certain situation. We describe this briefly below.
Let
be a sample space on which a probability measure is defined, and
be a collection of measurable subsets of
. If
are samples drawn independently from
with the given probability measure, then for every fixed
, the random variable

goes to
in probability as
. This is just the law of large numbers.
However, if we want the convergence to zero to be uniform i.e. we want the random variable

to go to
in probability as
, then we need to impose further conditions (in the way we have stated this it is not clear that that the function on the left hand side is measurable, but let us assume that it is. It will be measurable if we should assume that
is countable for instance).
Vapnik and Chervonenkis proved in their paper:
Vapnik, V. N; Chervonenkis, A. Ya. On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities, Theory of Probability and its Applications; Philadelphia Vol. 16, Iss. 2, (1971): 17. DOI:10.1137/1116025
that a suffcient condition for the uniform convergence is

More precisely they proved (Theorem 2, loc. cit.) that for every
, 
In particular, if
is finite, and hence
(cf. (1)) is polynomially bounded in
by Lemma 1, then we have uniform convergence to
in probability.
3. Application in discrete and computational geometry
VC dimension has played an extremely important role in discrete and computational geometry — in particular, in the algorithmic problem often referred to as “range searching”.
Reusing terminology from before let
be a set and
. In the context of computational geometry,
is often a finite dimensional real vector space, and elements of
are often a class of semi-algebraic sets having “constant description complexity” (for example,
, and
the set of all unit disks, or half spaces, or triangles etc.). Elements of
are referred to as “range” sets. The range searching problem of takes as input given a finite
set of
points in
. The goal is to “preprocess” them, so that given a query “range”
, one can report quickly properties of the set
: for example, the cardinality of
, or whether
is empty or not. One can find details in the survey article:
Agarwal, Pankaj K. “Simplex range searching and its variants: a review” A journey through discrete mathematics, 1–30, Springer, Cham, 2017.
A key tool in efficient solutions proposed for the range searching problem is the notion of of
-nets (and the related notion of
-approximations). The idea is to represent the given set
with a very small sized “representative” subset in
which is guaranteed to have good properties with respect to the range sets in
.
Definition 2 Given a pair
, and
, and
a finite subset of
. A subset
is called an
-net (resp.
-approximation) for
if
for all
with
(resp. if
for all
).
The important result is (which we state in terms of finite sets or equivalently for atomic measures than for more general measure spaces) is the following. The VC dimension plays of the range
of course plays a critical role in it.
Theorem 3 If
, then there exists
, such that for every
, and finite subset
of
, there exists an
-net (resp. an
-approximation) for
of cardinality at most
(resp.
).
Haussler, David; Welzl, Emo
-nets and simplex range queries. Discrete Comput. Geom. 2 (1987), no. 2, 127-151.
Komlos, Janos; Pach, Janos; Woeginger, Gerhard Almost tight bounds for
-nets. Discrete Comput. Geom. 7 (1992), no. 2, 163-173.
The key point of course is the fact that the cardinality of the
-net (resp.
-approximations) can be bounded independent of the cardinality of
. Thus, with respect to the “range space”
, any finite set
can be replaced by an “approximation” (i.e. an
-net or an
-approximation) of size that depends on
and the pair
but independent of the cardinality of
. This is what makes the above theorem key to many results in computational geometry. Theorem 3 also has an effective (algorithmic) variant. In turns out that there is a highly efficient randomized algorithm that can compute an
-net with high probability — and together these lead to efficient solutions to the range searching problem.
4. Application in model theory
In model theory the notion of VC-density is connected with the so called “independence property” of formulas and models, which in turn is related to the property of of stability which is of fundamental importance in model theory.
Mathematical logic and model theory has its own set of notation and definitions — some of which may not be completely familiar to non-logicians. In the following paragraphs I condense in half-a-page what usually takes a couple of lectures and several pages worth of writing to write “properly” — hopefully with no loss of content. I assume familiarity with the definition of a “first-order formula” in some language but not much more.
Given a language
(i.e. a tuple of symbols representing a set, constants, functions, and relations) an
-structure
is a tuple consisting of a set,
along with a set of constants, functions and relations on
interpreting the corresponding symbols of
. So if
(also referred to as the language of rings with a unit element), then the tuple
with the usual interpretations of the symbols “
”, “
”, “
”, “
” is an
-structure.
If
, then is

is an example of an
-formula having
as its set of free variables. An
-formula with an empty set of free variables is an
-sentence. (The reader can now come up with the proper definitions of
-formulas and sentences.)
A theory
is a set of
-sentences. An
-structure
is a model of
(denoted
) if every sentence in
is “true” in the structure
(the reader can supply what it means to be be “true” in
). A theory is called consistent if it admits a model. A theory
is complete if every for every
-sentence
, either
for every model
of
, or
for every model
of
.
Quantifier elimination
Notice that a formula can have quantifiers. If every formula is equivalent
to a formula without quantifiers then we say that the theory
admits quantifier elimination. (Two formulas
are equivalent
iff
or equivalently iff
for every model
of
.) It is a classical result in algebraic geometry (often referred to in non-model theory context as the Lefschetz principle) that the theory of algebraically closed fields admits quantifier elimination. The theory of real closed ordered fields admits quantifier elimination (in the language of
of ordered fields) — this was proved by Tarski. Th property of quantifier elimination depends on the language. So the theory of real closed fields (without the order relation) does not admit quantifier elimination. Why ?
Types
One notion that we will not discuss much, but which plays a key role in the application that we describe in Section 8, is that of types. Given an
-structure
, a subset
, and
, an
-type
over
is a consistent set of
-ary
-formulas with parameters in
(this is the same as saying that there exists an elementary extension
of
and
such that
for every
.) An
-type over
is called complete if for every
-ary formula
with parameters in
, either
or
. The set of
-types over
is denoted
. The type space
carry a natural topology referred to as the Stone topology.
Types in model theory are a device for producing a topological space out of (a model of) a theory
, analogous to how algebraic geometers produce the topological space
(equipped with the Zariski topology) out of a commutative ring
. In fact it is more than an abstract analogy. If
is a model of the theory of algebraically closed fields, then the space of complete
-types,
, is indeed in bijection with
(it is a nice do-able exercise to come up with this bijection just from what has been said above). Moreover, the natural bijective map
is continuous (with the Stone topology on the left and the Zariski topology on the right). Its inverse though is not continuous (the Stone topology is much finer than the Zariski topology).
Let us now fix a language
, and a
-theory
, and let
be a model of
. A subset
of
is said to be definable if there exists a
-formula
with
, such that
(in model theory lingo it is usual to write in this case
). We also say
is definable with parameters if there exists a formula
, such that there exists
such that
. The set of subsets
is a definable family of subsets of
.
(One final piece of warning about notational abuse. Model theory textbooks often drop the exponents in formulas like
and just write
even when
. This is harmless and does not cause any ambiguity.)
O-minimal structures
One class of structures that is playing an increasing important role in applications of model theory to other areas of mathematics (Diophantine geometry, Hodge theory amongst others), and which also plays an important role in the “VC story” to which this post is devoted, is the class of o-minimal structures.
Let
be a language having a binary relation symbol
. Then, an
-structure
is said to be o-minimal if the reduct
is a model of the theory of dense linear order, and the set of definable subsets (with parameters) of
is the smallest possible — namely, finite unions of points and intervals (such sets are clearly definable with parameters in the ordered structure
). This explains the name “order-minimal” or o-minimal. The most interesting examples of o-minimal structures are those that expand the o-minimal structure
where
is a real closed field. Gabrielov proved that there exist strict expansions of
(as his undergraduate thesis work in Moscow State University long before the name “o-minimal” was invented!), and many other structures have been shown to be o-minimal since then. The terminology of “o-minimality” was introduced by Pillay and Steinhorn.
Gabrielov, A. M. Projections of semianalytic sets. (Russian) Funkcional. Anal. i Priložen. 2 1968 no. 4, 18-30.
Pillay, Anand; Steinhorn, Charles Definable sets in ordered structures. Bull. Amer. Math. Soc. (N.S.) 11 (1984), no. 1, 159-162.
The definable sets in an o-minimal structure share many of the tameness properties of real semi-algebraic sets (uniform finiteness in families of topological invariants such as the number of connected components or the total Betti number, local conical structure etc.). As such they provide a natural setting for studying geometric problems without having to worry about wild oscillatory functions, for example
over
.
A standard source book for o-minimal structures is:
van den Dries, Lou. Tame topology and o-minimal structures. London Mathematical Society Lecture Note Series, 248. Cambridge University Press, Cambridge, 1998. x+180 pp. ISBN: 0-521-59838-9
Stability and independence
We now discuss the (related) notions of stability and independence of theories. We start with the former since stability is the more important of the two notions (but we will soon see that it is independence that has to do with VC theory).
Let
be a
-formula (here
are tuples of free variables). The formula
is said to be stable if there does NOT exist sequences
in some model of
of
, satisfying
if and only if
. The theory
is said to be stable if every
-formula
is stable. In other words,
is stable if it is not possible to order infinite sequences using a formula (modulo
) in the above sense. The theory of real closed fields is not stable — take
(which expresses
), and let
. On the other hand the theory of algebraically closed fields is stable. The stability property of theories has several equivalent definitions. For example, stability can be defined in terms of cardinalities of type spaces (“stable theories have few types”). These different definitions seem to have nothing to do with each other on the surface (the hallmark of most important notions in mathematics).
We next define the notion of independence. We say that a formula
has the independence property if there exists a sequence
in some model of
of
, such that for all
, the set of formulas
is consistent. A theory is said to “not have the independence property” (NIP for short) if no
-formula
has the independence property. If a theory has the independence property then the theory cannot be stable (Exercise !). However, the converse is not true — NIP theories need not be stable. For example, the theory of real closed fields is NIP but not stable. The following theorem characterizes the gap between being NIP and stable. We say that a formula
has the strict order property if there exists a sequence
in some model
of
, such that
if and only if
.
Theorem 4 (Shelah) A theory is not stable if and only if there exists a formula
having the independence property or the strict order property.
A good source for NIP theories is the book:
Simon, Pierre A guide to NIP theories. Lecture Notes in Logic, 44. Association for Symbolic Logic, Chicago, IL; Cambridge Scientific Publishers, Cambridge, 2015. vii+156 pp. ISBN: 978-1-107-05775-3
NIP and finiteness of VC density
Finally, coming back to VC-density — it is an easy exercise to see that if
is a model of
, and
a
-formula — then
not having the independence property is equivalent to the statement that the definable family
of subsets of
has finite VC-density. In particular, any definable family of sets in an NIP structure has finite VC dimension/density.
5. VC-density and
-patterns
For any set
and a finite set of subsets
of
, the set of
-patterns realized by
is the subset
of
defined by
![\displaystyle \Sigma(V,\mathcal{C}) = \{\sigma \in \{0,1\}^{[1,n]} \mid \exists v \in V \mbox{ such that } \sigma(i) = \chi_{V_i}(v) \},](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5CSigma%28V%2C%5Cmathcal%7BC%7D%29+%3D+%5C%7B%5Csigma+%5Cin+%5C%7B0%2C1%5C%7D%5E%7B%5B1%2Cn%5D%7D+%5Cmid+%5Cexists+v+%5Cin+V+%5Cmbox%7B+such+that+%7D+%5Csigma%28i%29+%3D+%5Cchi_%7BV_i%7D%28v%29+%5C%7D%2C+&bg=ffffff&fg=000000&s=2&c=20201002)
where
is the ordinary characteristic function of the set
.
Let
be sets and
, and let
be the two projections. We will abuse notation slightly and denote for
(resp.
)
(resp.
) by
(resp.
). Let
and
.
It is an easy exercise now to check that for
,

Denoting

we have

In particular, upper bounds on the cardinalities of the set of
-patterns realized by finite subsets of
, gives upper bound on the VC density of the family
. The VC density of the definable family
is then called the VC codensity of the definable family
.
Example 1 Let
, and
. Then,
is the set of all closed unit disks in the
, and
the set of all closed unit disks in
.
Proving tight upper bounds on the VC codensities of (or equivalently
-patterns realized by) definable families in various structures has attracted a lot of attention of both combinatorialists and model theorists. In many combinatorial applications, upper bounds on the number of
-patterns or sign patterns if the underlying field is real give upper bounds on the number of interesting objects — classical applications of such methods include proving upper bounds on the number of realizable oriented matroids, combinatorial types of polytopes etc.
The very first application of this type that I am aware of is:
Goodman, Jacob E.; Pollack, Richard. There are asymptotically far fewer polytopes than we thought. Bull. Amer. Math. Soc. (N.S.) 14 (1986), no. 1, 127-129.
More recently, results in incidence combinatorics (such as the Szemeredi-Trotter theorem and its various generalizations) which have been traditionally studied over the real and complex fields, have being generalized to more general model-theoretic structures. Upper bounds on VC density/codensity often play a key role in such questions.
For the rest of this post we will now switch to the more geometric language of
-patterns and explain how techniques from topology and algebraic geometry help to prove tight upper bounds on the number of
-patterns.
6. Bounding the number of
-patterns: what has cohomology to do with it ?
It is instructive to consider the following continuation of Example 1.
To illustrate the main idea consider the problem of bounding the number of
-patterns realized by
closed unit disks in
(cf. Example 1), assuming that the boundaries of the disks intersect transversally.

It is not difficult to see that the number of
-patterns in this case is bounded by the number of connected components of the complement of the union of the boundaries of these disks.

So it suffices to bound this number. We make another simplification for exposition. It makes no difference topologically since all the boundaries are compact to assume that the arrangement of
circles is on the sphere
rather than on
(by taking a one-point compactification of the plane if you wish).
Let the
unit circles be denoted
. Using Alexander duality,

where I am denoting by
the dimension of the
-th cohomology group with rational coefficients. To bound
we make use of a spectral sequence argument. We consider the Mayer-Vietoris spectral sequence (more precisely the spectral sequence of the row-wise filtration of the Čech complex) of the covering
of the union
:
![\displaystyle E_r^{p,q} \Rightarrow \mathrm{H}^{p+q}(\bigcup_{i \in [1,n]} S_i).](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+E_r%5E%7Bp%2Cq%7D+%5CRightarrow+%5Cmathrm%7BH%7D%5E%7Bp%2Bq%7D%28%5Cbigcup_%7Bi+%5Cin+%5B1%2Cn%5D%7D+S_i%29.+&bg=ffffff&fg=000000&s=2&c=20201002)
The
-page of this spectral sequence is given by
![\displaystyle E_1^{p,q} = \bigoplus_{I \subset_{p+1} [1,n]} \mathrm{H}^q(S_I),](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+E_1%5E%7Bp%2Cq%7D+%3D+%5Cbigoplus_%7BI+%5Csubset_%7Bp%2B1%7D+%5B1%2Cn%5D%7D+%5Cmathrm%7BH%7D%5Eq%28S_I%29%2C+&bg=ffffff&fg=000000&s=2&c=20201002)
where for
,

Observe that since in any spectral sequence,

we have for every 
![\displaystyle \dim \mathrm{H}^k(\bigcup_{i \in [1,n]} S_i) = \sum_{p+q=k} \dim E_{\infty}^{p,q} \leq \sum_{p+q=k} \dim E_{1}^{p,q}.](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cdim+%5Cmathrm%7BH%7D%5Ek%28%5Cbigcup_%7Bi+%5Cin+%5B1%2Cn%5D%7D+S_i%29+%3D+%5Csum_%7Bp%2Bq%3Dk%7D+%5Cdim+E_%7B%5Cinfty%7D%5E%7Bp%2Cq%7D+%5Cleq+%5Csum_%7Bp%2Bq%3Dk%7D+%5Cdim+E_%7B1%7D%5E%7Bp%2Cq%7D.+&bg=ffffff&fg=000000&s=2&c=20201002)

In partiular, taking
,

it suffices to bound the dimensions of the groups
and
separately. Now,
![\displaystyle E_1^{0,1} = \bigoplus_{i \in [1,n]} \mathrm{H}^1(S_i).](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%C2%A0E_1%5E%7B0%2C1%7D+%3D+%5Cbigoplus_%7Bi+%5Cin+%5B1%2Cn%5D%7D+%5Cmathrm%7BH%7D%5E1%28S_i%29.+&bg=ffffff&fg=000000&s=2&c=20201002)
In our case
for each
, and so
,
Also,
, and in our case
for each pair
and so

Putting everything together we obtain

This implies that the number of
-patterns realized by
closed unit disks (whose boundaries intersect transversally) is bounded by
.
The main things to observe about the above argument are the following.
- The bound on the
-terms of the spectral sequence is a product of two quantities, namely —
- A binomial coeficient counting the number of summands in the direct sum of case. (This number grows with
and has a polynomial dependence. The maximum degree of this polynomial amongst the various terms (
in our case) provides an upper bound on the VC codensity (of the family
).)
- An upper bound on a certain Betti number of a set obtained as the intersection of a bounded number (the bound does not grow with
) of sets obtained in some way from the original sets (taking boundaries in this instance). (There is an absolute upper bound on this part which is independent of
.)
- Secondly, what we actually ended up bounding is the sum of zero-th Betti number (i.e. the number of connected components) of the realizations of all the
-patterns realized by the set
. If we denote for

we have proved:
![\displaystyle \sum_{\sigma \in \{0,1\}^{[1,n]}} b_0(\mathcal{R}(\sigma)) \leq n^2+1. \ \ \ \ \ (2)](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Csum_%7B%5Csigma+%5Cin+%5C%7B0%2C1%5C%7D%5E%7B%5B1%2Cn%5D%7D%7D+b_0%28%5Cmathcal%7BR%7D%28%5Csigma%29%29+%5Cleq+n%5E2%2B1.+%5C+%5C+%5C+%5C+%5C+%282%29&bg=ffffff&fg=000000&s=2&c=20201002)
On other hand, since
, we have the inequality,
![\displaystyle \mathrm{card}(\Sigma(\mathbb{R}^2,\mathcal{D})) \leq \sum_{\sigma \in \{0,1\}^{[1,n]}} b_0(\mathcal{R}(\sigma)). \ \ \ \ \ (3)](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cmathrm%7Bcard%7D%28%5CSigma%28%5Cmathbb%7BR%7D%5E2%2C%5Cmathcal%7BD%7D%29%29+%5Cleq+%5Csum_%7B%5Csigma+%5Cin+%5C%7B0%2C1%5C%7D%5E%7B%5B1%2Cn%5D%7D%7D+b_0%28%5Cmathcal%7BR%7D%28%5Csigma%29%29.+%5C+%5C+%5C+%5C+%5C+%283%29&bg=ffffff&fg=000000&s=2&c=20201002)
Together, (2) and (3) imply that

- Finally, a key role (which is hidden by our use of Alexander duality) is played by the fact that the cohomology groups of semi-algebraic subsets of
vanishes in dimensions
and higher.The “spectral sequence argument” sketched above with certain extensions give a powerful method for obtaining tight bounds on VC codensities. In our example, we made some simplifications to make the exposition simpler. For example, we assumed that the boundaries of the disks intersected transversally. If this was not the case, and the circles were allowed to touch each other, then the argument would fail (the realizations of the
-patterns will not be bounded from above by the number of connected components of the complement of the union of circles. In this case one needs to replace the union of the boundaries by a more complicated system of sets (involving infinitesimal tubes of various widths around the sets of the given family). Furthermore, if the given family was not in
, but in some other semi-algebraic set not necessarily non-singular, then we would not be able to use Alexander duality. So there are many technical issues that I skipped but which needs to be dealt with in order to make the argument general. These can be found in:
Basu, Saugata; Pollack, Richard; Roy, Marie-Francoise On the Betti numbers of sign conditions. Proc. Amer. Math. Soc. 133 (2005), no. 4, 965-974.
7. Some History
As mentioned earlier, the problem of obtaining tight bounds on the VC-codensity (or equivalently, a bound on the number of
-patterns) for definable families in an NIP structure has attracted a lot of attention over the years.
For definable families of algebraic hypersurfaces in
of fixed degree over a field
, Babai, Ronyai, and Ganapathy in
Ronyai, Lajos; Babai, Laszlo; Ganapathy, Murali K. On the number of zero-patterns of a sequence of polynomials. J. Amer. Math. Soc. 14 (2001), no. 3, 717-735.
used a “linear algebra argument” to prove that VC-codensity of such families is bounded by
. This bound is easily seen to be optimal.
For families of hypersurfaces (defined by polynomials of bounded degrees) on any real variety of
, it is shown in
Basu, Saugata; Pollack, Richard; Roy, Marie-Francoise. On the Betti numbers of sign conditions. Proc. Amer. Math. Soc. 133 (2005), no. 4, 965-974.
using the topological method outlined in Section 6 that the VC-codensity is bounded by
.
Finally, it was shown in
Basu, Saugata. Combinatorial complexity in o-minimal geometry. Proc. Lond. Math. Soc. (3) 100 (2010), no. 2, 405-428.
that the VC codensity of any definable family of subsets of a definable set
in an arbitary o-minimal expansion of a real closed field is bounded by
.
(Note that some of the papers listed above prove more refined and general upper bounds on the Betti numbers of the realizations of
– patterns or sign patterns — where there is also explicit dependence on the family (such as on the degrees of the polynomials defining the hypersurfaces). The bound on the VC codensity is extracted from the exponent of “
” in the bound on the zero-th Betti number as in Section 6.)
8. VC codensity bounds over valued fields
A class of structures that is important in model theory (and indeed other areas of mathematics, such as number theory) are valued fields. We are mostly interested in fields
, with a non-archimedean valuation
, where
is an ordered group. Tha valuation
is assumed to be multiplicative and satisfy the ultrametric inequality — namely,
.
(Note that we are following model theory convention and writing the valuation multiplicatively instead of additively which is more standard).
The most familiar example of a valued field (with a non-archimedean valuation) is the field
with the
-adic valuation,
for some prime number
. Writing a non-zero rational number 

the
-adic valuation is given by

Given a valued field
, its valuation ring
is defined as
. The valuation ring
has a unique maxmal ideal
, and the quotient field
is called the residue field of
. The residue field of
is isomorphic to
. The pair
is called the characteristic pair of
. The characteristic pair of a valued field can be
, or
where
is a prime. The characteristic pair of
is clearly
. The fields of Puiseux series
with coefficients in a field
, furnishes examples of valued fields with characteristic pairs
or
depending on the characteristic of
.
From the model theory perspective, the structure of a valued field is an example of a two-sorted structure (as opposed to the one-sorted structures defined in Section 4). This means that there are two sorts of variables — of the field sort and the value group sort. The language of valued fields is correspondingly two sorted and is the tuple
where the subscripts
refer to the two sorts — field and value group respectively. It is a non-trivial theorem that the two-sorted theory of algebraically closed valued fields admit quantifier elimination (in the corresponding two-sorted language). Thus if
is a model of the theory algebraically closed valued fields, then every definable subset of
can be described by a quantifier-free formula, whose atoms are of the form
, where
and
.
A good source for the model theory of algebraically closed valued fields (and also for what comes next) is:
Haskell, Deirdre; Hrushovski, Ehud; Macpherson, Dugald. Stable domination and independence in algebraically closed valued fields. Lecture Notes in Logic, 30. Association for Symbolic Logic, Chicago, IL; Cambridge University Press, Cambridge, 2008. xii+182 pp. ISBN: 978-0-521-88981-0
If
is an algebraically closed valued field (for example
), then the problem of obtaining tight bounds on the VC-codensity for definable families in
was considered in:
Aschenbrenner, Matthias; Dolich, Alf; Haskell, Deirdre; Macpherson, Dugald; Starchenko, Sergei Vapnik-Chervonenkis density in some theories without the independence property, I. Trans. Amer. Math. Soc. 368 (2016), no. 8, 5889-5949.
In particular, they prove a bound of
for the VC-codensity in the special case when the characteristic pair of
is
.
If one contrasts this result with those in Section 7 above, one sees that the corresponding bound for the theories of algebraically closed and real closed fields, and also for any o-minimal theory, is
rather than
. In fact, the bound in those cases equals the “cohomological dimension” (appropriately defined) of the ambient definable set for the definable family being considered.
The method used in the paper of Aschenbrenner et. al. is algebraic and model theoretic. It is natural to consider a more topological approach. The first obstacle is that definable subsets of a valued field might not carry a nice enough topology on which the cohomological method described in Section 6 can be applied. For various reasons (non-vanishing in dimensions bigger than the dimension of the ambient variety being one of them) ordinary sheaf cohomology with respect to the Zariski or etale site for schemes are not suitable.
In order to get around this difficulty we first need to take a detour.
Given an affine variety
defined over a complete valued field, we will embed
and all definable subsets of
, in a larger topological space having better topological properties than
itself. Embedding into a larger space can only increase the number of realizable
-patterns — and thus an upper bound on the number of
-patterns realized in the larger space is still an upper bound on the corresponding number of the original family. The technical tool to achieve this is also known as Berkovich analytification introduced by Berkovich.
Berkovich analytification
Suppose that
is a complete valued field with a non-trivial non-archimedean valuation.
Let
be an integral domain. A multiplicative, non-archimedean seminorm on
is a map
satisfying,
, and
, and the ultrametric inequality,
.
Now, let
be an affine
-variety. Berkovich associates to
a space
in the following way. The points of
are multiplicative semi-norms
, whose restriction to
equals the given valuation on
(semi-norms on fields are valuations as defined previously). For each
, let
denote the corresponding semi-norm. The topology on
is the weakest topology that makes all the evaluation maps,

continuous. Moreover, (this is key for our application), for each closed point
, the function

(noticing that
) is a semi-norm, and this gives an embedding of the closed points of
into
.
The Berkovich analytic space
has nice topological properties (it is locally compact for example), but it is not so easy to visualize. The Berkovich analytification of the projective line
for example is homeomorphic to an infinite rooted tree. Moreover, for application to bounding VC codensity one has to deal with not just varieties, but definable subsets of these varieties as well.
Berkovich, Vladimir G. Spectral theory and analytic geometry over non-Archimedean fields. Mathematical Surveys and Monographs, 33. American Mathematical Society, Providence, RI, 1990. x+169 pp. ISBN: 0-8218-1534-2
Hrushovski-Loeser
Fortunately, the recent break-through results of Hrushovski and Loeser gives us to get around the problems mentioned above. Hrushovski and Loeser give an alternative definition of
as spaces of certain kind types. (As has been mentioned before, the model theorists’ method of producing spaces from rings is to define them as spaces of types — cf. discussion relating the
of a polynomial ring in
variables with the space of complete
-types in Section 4 above). Moreover, associated to any quantifier-free formula with atoms of the form
, one has a corresponding semi-algebraic subset of
(thought of now as a space of types). Astoundingly, Hrushovksi and Loeser prove certain key topological tameness properties of these “semi-algebraic sets” which are entirely analogous to those known in the case of o-minimal structures. Moreover, they prove that these “semi-algebraic” subsets of
are homotopy equivalent to finite simplicial complexes of dimension at most
. Therefore, their cohomological dimension is at most
.
Hrushovski, Ehud; Loeser, Francois. Non-archimedean tame topology and stably dominated types. Annals of Mathematics Studies, 192. Princeton University Press, Princeton, NJ, 2016. vii+216 pp. ISBN: 978-0-691-16169-3; 978-0-691-16168-6
Using the results of Hrushovski and Loeser, in particular the vanishing of cohomology in dimensions greater than
, Deepam Patel and I, were able to “import” the o-minimal techniques for proving upper bounds on Betti numbers of realizations of
-patterns to the valued field case. Available here.
We prove the following theorem (stated a bit informally to avoid introducing more notation).
Theorem 5 Let
be a algebraically closed complete non-archimedean valued field,
be an affine
-variety and
a fixed definable family of semi-algebraic subsets of
. Then, there exists
, such that for each
, and for every finite subset
of elements of the family 
![\displaystyle \sum_{\sigma \in \{0,1\}^{[1,n]}} b_i(\mathcal{R}(\sigma)) \leq c \cdot n^{\dim V -i},](https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Csum_%7B%5Csigma+%5Cin+%5C%7B0%2C1%5C%7D%5E%7B%5B1%2Cn%5D%7D%7D+b_i%28%5Cmathcal%7BR%7D%28%5Csigma%29%29+%5Cleq+c+%5Ccdot+n%5E%7B%5Cdim+V+-i%7D%2C+&bg=ffffff&fg=000000&s=2&c=20201002)
where

As an corollary (taking as usual
) in the above bound we obtain (using the same notation as in Theorem 5(:
Corollary 6 The VC codensity of any definable family of definable subsets of
is bounded by
.
Corollary 6 removes the multiplicative factor of
from the upper bounds proved by Aschenbrenner et. al. mentioned before. Also, there is no restriction on the characteristic pair of
(thanks to the generality of the results by Hrushovski and Loeser).
Archimedean vs non-archimedean
It is a natural question to ask about what happens if we consider subsets defined by inequalities
when the
is the archimedean norm — for example, when
, and
denotes the modulus. The answer is that for a definable family in
defined by a formula having atoms of the form
, the VC codensity can be as large as
. An easy example is the following. Consider the definable family,
, of unit disks in
defined by
, as
varies over
. Given
unit disks in the plane, it is easy to check that the number of
-patterns can grow quadratically — so the VC codensity of
is
. What happens to the family defined by the same formula in the non-archimedean case ? In this case, because of the ultra-metric inequality, two disks are either disjoint or equal. Hence the number of
-patterns is bounded by
, and the VC codensity is
.
Epilogue
I hope I have been able to convey the breadth of mathematical topics that the notion of VC dimensions has connections to. It remains an extremely active topic. For example, here are two very recent combinatorial papers from the arXiv in which VC dimension plays a critical role.
Bounded VC-dimension implies the Schur-Erdos conjecture, Jacob Fox, Janos Pach, Andrew Suk
An Efficient Algorithm for Generalized Polynomial Partitioning and Its Applications. Pankaj K. Agarwal, Boris Aronov, Esther Ezra, Joshua Zahl
There are many open questions — some directly related and some unrelated to the the topics discussed above.
One immediate open problem would be to extend the “cohomological method” to the
-adics. The problem here is that the theory of
admits quantifier elimination only in an extended language (using the so called Macintyre’s predicates for example)– and thus not all definable subsets of
can be defined by quantifier-free formulas with atoms of the form
.
A more open ended problem (to which I will devote another post) is to “integrate” cohomology more directly inside the model theoretic framework. Some idea of what I have in mind can be found here (joint work with Deepam Patel).