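Suppose that we are given information about the distribution of n variables in the form of the distributions of some subsets of those variables. Let the subsets be denoted $S_1, \dots, S_m$, subsets of $\{1, \dots, n\}$. For example, let n=3 and let $S_1, S_2$ be given with $S_1 = \{1, 2\}$ and $S_2 = \{2, 3\}$, indicating that we are given $p(x_1, x_2)$ and similarly $p(x_2, x_3)$. Call this information K, for known. Now infer the distribution of the n variables given this information, and call this inference $p(x_1, \dots, x_n \mid K)$. The inferred uncertainty of the full distribution is then given by the entropy $H(x_1, \dots, x_n \mid K)$. Because $p(x_1, \dots, x_n \mid K)$ is consistent with K, for each i its marginal over the variables not in $S_i$ is one of the given distributions, $p(x_{S_i})$, where $x_S$ denotes the variables indexed by $S$. Doing some algebra, for two sets A, B, and writing $H(S)$ for the entropy of $x_S$ and $I(\cdot\,;\cdot \mid \cdot)$ for the conditional mutual information,

$$H(A \cup B) = H(A) + H(B) - H(A \cap B) - I(A \setminus B;\, B \setminus A \mid A \cap B).$$

Now, note that $I(A \setminus B;\, B \setminus A \mid A \cap B) \ge 0$. Then it follows that $H(A \cup B) \le H(A) + H(B) - H(A \cap B)$ (all conditioned on K). Note that equality holds iff $x_{A \setminus B}$ is independent of $x_{B \setminus A}$ when conditioned on $x_{A \cap B}$.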
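The two-set relation can be checked numerically. The sketch below is only an illustration under stated assumptions: three binary variables with a randomly drawn joint distribution, the sets A and B taken as the zero-based index sets {0, 1} and {1, 2}, and helper names (`marginal`, `entropy`, `H`, `cond_mutual_info`) that are hypothetical. It compares the entropy of the union with H(A) + H(B) - H(A ∩ B), computes the conditional mutual information between the variables only in A and those only in B (given the shared ones) directly from its definition, and then builds a distribution with the same two marginals in which that conditional independence holds, so that the bound is attained.

```python
import itertools
import math
import random


def marginal(p, keep):
    """Marginal of a joint {outcome tuple: prob} on the index positions in `keep`."""
    out = {}
    for x, px in p.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + px
    return out


def entropy(p):
    """Shannon entropy in bits of a distribution {outcome: prob}."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def H(p, index_set):
    """Entropy of the variables whose zero-based indices are in `index_set`."""
    return entropy(marginal(p, sorted(index_set)))


def cond_mutual_info(p, U, V, W):
    """I(U; V | W) from its definition; U, V, W are disjoint index sets."""
    U, V, W = sorted(U), sorted(V), sorted(W)
    p_uvw = marginal(p, U + V + W)
    p_uw = marginal(p, U + W)
    p_vw = marginal(p, V + W)
    p_w = marginal(p, W)
    total = 0.0
    for key, puvw in p_uvw.items():
        if puvw <= 0:
            continue
        u = key[:len(U)]
        v = key[len(U):len(U) + len(V)]
        w = key[len(U) + len(V):]
        total += puvw * math.log2(puvw * p_w[w] / (p_uw[u + w] * p_vw[v + w]))
    return total


# Assumed example: a random joint distribution over three binary variables.
rng = random.Random(0)
outcomes = list(itertools.product((0, 1), repeat=3))
weights = [rng.random() for _ in outcomes]
p = {x: w / sum(weights) for x, w in zip(outcomes, weights)}

A, B = {0, 1}, {1, 2}  # zero-based index sets chosen for illustration

lhs = H(p, A | B)
bound = H(p, A) + H(p, B) - H(p, A & B)
gap = cond_mutual_info(p, A - B, B - A, A & B)

# Distribution with the same marginals on A and B, but with the variable only
# in A independent of the variable only in B given the shared variable.
p_a, p_b, p_ab = marginal(p, [0, 1]), marginal(p, [1, 2]), marginal(p, [1])
q = {(a, b, c): p_a[(a, b)] * p_b[(b, c)] / p_ab[(b,)] for (a, b, c) in outcomes}

print(f"H(A u B)                  = {lhs:.6f}")
print(f"H(A) + H(B) - H(A n B)    = {bound:.6f}")
print(f"conditional mutual info   = {gap:.6f}")
print(f"H(A u B) for q (equality) = {H(q, A | B):.6f}")
```

Since every distribution consistent with the two given marginals obeys the bound and `q` attains it, `q` has the largest entropy among them in this example.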
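This can be extended to more than two subsets by noting that, when equality holds, the algebra is the same as that for counting the elements of sets under intersection and union, i.e. it is additive. The approximate result is then

$$H(x_1, \dots, x_n \mid K) \approx \sum_{i} H(S_i) - \sum_{i<j} H(S_i \cap S_j) + \sum_{i<j<k} H(S_i \cap S_j \cap S_k) - \cdots \tag{3.33}$$

where $H(S)$ is the entropy of the variables indexed by $S$, with $H(\emptyset) = 0$, and the $S_i$ are assumed to cover all n variables.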
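The set-counting analogy suggests an alternating sum over intersections of the given subsets. The following sketch is one way to compute such an estimate; it is an illustration, not the author's procedure. The function name `inclusion_exclusion_entropy`, the four-variable binary Markov chain, its transition table, and the choice of subsets {0, 1}, {1, 2}, {2, 3} (zero-based) are all assumptions made for the example. For a Markov chain, each non-shared variable is conditionally independent of the rest given the shared one, so the estimate should agree with the exact joint entropy.

```python
import itertools
import math


def marginal(p, keep):
    """Marginal of a joint {outcome tuple: prob} on the index positions in `keep`."""
    out = {}
    for x, px in p.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + px
    return out


def entropy(p):
    """Shannon entropy in bits of a distribution {outcome: prob}."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def H(p, index_set):
    """Entropy of the variables indexed by `index_set`; zero for the empty set."""
    return entropy(marginal(p, sorted(index_set)))


def inclusion_exclusion_entropy(p, subsets):
    """Alternating sum of entropies over intersections of the given subsets."""
    estimate = 0.0
    for r in range(1, len(subsets) + 1):
        sign = 1 if r % 2 else -1  # + for single sets, - for pairs, + for triples, ...
        for combo in itertools.combinations(subsets, r):
            estimate += sign * H(p, set.intersection(*combo))
    return estimate


def markov_chain(initial, transitions):
    """Joint distribution of a chain x0 -> x1 -> ... with the given transition tables."""
    joint = {}
    states = range(len(initial))
    for x in itertools.product(states, repeat=len(transitions) + 1):
        prob = initial[x[0]]
        for t, (a, b) in enumerate(zip(x, x[1:])):
            prob *= transitions[t][a][b]
        joint[x] = prob
    return joint


# Assumed example values, chosen only for illustration.
initial = [0.3, 0.7]
T = [[0.9, 0.1], [0.2, 0.8]]
chain = markov_chain(initial, [T, T, T])

subsets = [{0, 1}, {1, 2}, {2, 3}]
estimate = inclusion_exclusion_entropy(chain, subsets)
exact = H(chain, {0, 1, 2, 3})

print(f"inclusion-exclusion estimate = {estimate:.6f}")
print(f"exact joint entropy          = {exact:.6f}")
```

The sum runs over every non-empty collection of the given subsets, so its cost grows exponentially with the number of subsets; that is acceptable for a small illustration but not for large families.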
The connection to the information correlation functions
can be made by taking the
. Then the right side of equation 3.33 is just
, so that
equals how much the expression on the right side mis-estimates the entropy of all of the variables. (In general, the sign of
may be positive or negative. Take
to make everything consistent.) This observation also allows us to generalize the notion of the information correlation function to include those functions defined similarly when any set of indicator sets is given, that is
![]()
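The remark that the mis-estimate may have either sign can be illustrated with two small cases. This sketch rests on assumptions that are not taken from the text: the given subsets are all three pairs of three binary variables (zero-based indices), and the two joint distributions are hand-picked, one where the third variable is the exclusive-or of two fair independent bits and one where all three variables are copies of a single fair bit. The helper names are hypothetical.

```python
import itertools
import math


def marginal(p, keep):
    """Marginal of a joint {outcome tuple: prob} on the index positions in `keep`."""
    out = {}
    for x, px in p.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + px
    return out


def entropy(p):
    """Shannon entropy in bits of a distribution {outcome: prob}."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def H(p, index_set):
    """Entropy of the variables indexed by `index_set`; zero for the empty set."""
    return entropy(marginal(p, sorted(index_set)))


def inclusion_exclusion_entropy(p, subsets):
    """Alternating sum of entropies over intersections of the given subsets."""
    estimate = 0.0
    for r in range(1, len(subsets) + 1):
        sign = 1 if r % 2 else -1
        for combo in itertools.combinations(subsets, r):
            estimate += sign * H(p, set.intersection(*combo))
    return estimate


def misestimate(p, subsets, n):
    """Estimate minus exact entropy of all n variables (positive = overestimate)."""
    return inclusion_exclusion_entropy(p, subsets) - H(p, set(range(n)))


pairs = [{0, 1}, {1, 2}, {0, 2}]  # assumed choice: every pair of three variables

# Third variable is the exclusive-or of two fair, independent bits.
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
# All three variables are copies of one fair bit.
all_equal = {(a, a, a): 0.5 for a in (0, 1)}

err_xor = misestimate(xor, pairs, 3)
err_equal = misestimate(all_equal, pairs, 3)

print(f"exclusive-or case: estimate - exact = {err_xor:+.3f} bits")
print(f"all-equal case:    estimate - exact = {err_equal:+.3f} bits")
```

In the exclusive-or case the alternating sum overestimates the joint entropy by one bit, and in the all-equal case it underestimates it by one bit, so the discrepancy has no fixed sign.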