0:00

One of the reasons for the increasing popularity of map inference is that there

is a large subclass of map problems that actually allow tractable inference.

Now we've already see one class of map problems that allows tractable inference,

which is the graphical models that have low tree width, for which you can apply

variable elimination or clique trees. But it turns out that besides that,

there's actually a rich set of other sub-classes that can be solved using

efficient algorithms, often based on ideas from computer science theory.

So let's look at some of these examples and see how we might go about solving

them. Once such important class of problems is

that of correspondence, or data association problems.

Here we have two classes of objects. In this case the red object on the one

side, and the blue objects in the other. And we'ld like to find a correspondence,

or matching between them. And in fact it's the matching means that

each that, that the assignment has to be one to one.

So you can't have a single red dot match more than one blue dot, or vice-versa.

So here for example, is one. Such matching.

And we may additionally require that all the red dots be matched or all the blue

dots be matched, which can't be done in this case.

So what is the quality of a matching given the the set of possible matching,

so how do we select between them. So we have in the model a notion of the

quality of the match between I and J which we donate in this case as the

parameter theta I J. And so our goal in this in this problem

is to find the highest scoring matching where the score is defined as the sum of.

The matches multiplied by the quality of their score, and we constrain the

variables X I J, these binary variables that indicate whether I is mapped to J to

via the mutual exclusion constraint. That we discussed earlier.

Now, this type of problem is easily solved by using bipartite matching

problems developed in computer science theory.

And, it turns out that, in the context of graphical models.

or probabilistic inference. This type of problem has numerous

applications. So, for example, we might have, on the

one hand, say, the the red side. A set of sensor readings from a number

of, say, airplanes, that we're trying to keep track of.

And on the blue side, we might have a set, a set of airplanes or objects that

we've previously detected. And we'd like to figure out which sensor

reading corresponds to which object. Or we might have features in two

unrelated images. And we'd like to figure out, which

feature. Image one matches which feature in image

two. Or if we're trying to do some kind of

text analysis we might have a set of entities.

Say people or organizations that we're trying to reason about, or learn

something about. And we're trying to match mentions in the

text to those entities. Subjects in the constraint that each

mention correspond to exactly one entity. So lets look at one example application

of this which corresponds to one of the examples that we showed on the

[INAUDIBLE], mentioned on the previous slide which is that of 3D cell

reconstruction. In this case we have over here a cell

that we are trying to assess with 3D structure and we are looking at various

slices through that cell taken from different angles via microscope.

And so these are various two dimensional projections of the 3D object and we are

trying to figure out from that three dimensional structure.

Now within the cell we might have these little tiny gold beads that were put into

the cell so that we can somehow register the different images of the cell to each

other. So, here these are examples of two such

real images and these are the little gold beads you can see them in the little

circles and what we would like to do is we would like to.

Decide that this beat over here is the same as that beat over there.

And once we have that correspondence, we can then compute the 3D reconstruction

using some volumetric reconstruction algorithms.

And in this case the matching weights; those data IJ's that we saw on the

previous slide, cor, represent the similarity between this position.

I mean this dot and this dot. In terms of the location, in the image.

And in terms of the neighborhood appearance in in the vicinity of that

point. A very different application for what is

effectively the same kind of idea is if we have a 3-D mesh scan of a, of an

object such as a human being. And we'd like to figure out how points on

this mesh match up to points on a somewhat related but different mesh.

Whether it's different pose or different shape.

So here we would like to realize that this point over here maps to this point

over here. Or that this point on this mesh maps to

this point on this mesh. And once again these matching weights,

the Vetta IJ represent exactly the similarity of both location and, again

the local neighborhood appearance. Another very commonly used class of

retractable math sub problems are those that involve what are called associative

potential. Associative also called regular in some

cases or attractive are cases where the variables that are connected to each

other like to take the same values. So let's consider this example on the

context of binary value tangen variables where we have a variable XI.

And the variable XJ that are connected to each other and we have the on diagonal

potentials of the zero, zero and the one, one which

are the cases where XI and XJ take the same value.

And we have the off diagonal potentials zero one, one zero where the variables

take on different values and what we would like in an associative model is for

the zero, zero and the one, one cases to be preferred over the off diagonal cases.

And that can be formulated using the following constraint in which we have

that A plus D which are the on diagonal entries are in sum greater than or equal

to B plus C. So that in aggregate we prefer the on

diagonal to the off diagonal entries. It turns out that.

If you have even an arbitrary network in terms of fully dense clinic tivity over

binary variables, that use only the Singleton potentials, and, these,

pairwise potentials that are associative. They're also called super modular, which

is a term that comes from mathematics. if the pairwise potential satisfy this

constraint that we saw over here, then you can find an exact solution regardless

of the network connectivity. And that can be done using algorithms

from communitorial optimization from computer science theory for finding

minimum cuts in graphs. For non-binary valued random variables

exact solutions are no longer possible, but it turns out that very simple

heuristics can get very, very high quality answers very efficiently.

and so many related variance that are not necessarily binary admit either exact or

approximate solutions and specifically we talked before about the class of metric

MRFs that occur a lot in computer vision as well as other applications.

And it turns out that for metric MRFs even over non-binary variables you can

you can use these very efficient algorithms.

So, one such example is in the context of depth reconstruction.

Where we have two views of the same object taken from slight disparity, say,

using in a, using a stereo camera. So here's view one and view two in the

same scene. You can see that they're slightly

different from each other and and if we can and we now have for a given

we're trying to infer the following image that tells us for each pixel in the image

what is the actual depth. That is the Z dimension.

And one of the constraints that we like to enforce is that because of smoothness

in the image, the z value for a pixel is similar to the z value of an adjacent

pixel. And that allows us to clean up a lot of

noise that might arise if we just try to infer the depth of individual pixels by

themselves. And there are many other such examples in

computer vision that are used quite commonly and use this kind of associative

potentials. Examples include, for example, denoising

of images infilling of missing regions, and a whole variety of other tasks,

including foreground, background segmentation, and so on.

. Another kind of factor that admits

efficient solutions is a cardinality factor.

And this is a factor that in general can involve arbitrarily many binary variables

X1 up to XK. But we're going to require the, that it's

sparsely representable in the sense that the score for an assignment that X1 up to

XK is a function of how many XI's are turned on.

So to take an example, here is a factor like this over A B C and D and you can

see that we have only five different values for entries in this factor.

We have this pink entry over here when the sum of the X Is, is zero.

We have this blue entry over here that occurs when the sum of the entries is

one. Now for these four entries you have the

purple one that occurs when we have two XI's on the green one occurs when we have

three and the dark blue a bit occurs when we have four.

Okay so there is only five different possible values here even though there's

potentially two to four different combination's.

And it turns out, that this actually occurs in, quite a number of different

application, when we talked about, parody constraints for example, in message

decoding, parody constraints are, something that only looks at the number

of XI's that are on, if you want to have digital image segmentation and you want

to have some kind of prior on the number of pixels in a given category, you want

to say that, in this image we expect the number of pixels, to be between you know,

that are labeled sky, to be between 30 and 70 percent.

This is potentially a factor over all pixels in the image.

But it's, but it only counts the total number of pixels.

And so again, it turns out to be efficient.

And there's many other examples as well. A different one that also turns out to be

very useful in practice is the notion of sparse sparse patterns.

So here again you could have an arbitrary number of variables X1 of the XK, but you

only specify score for some small number of assignments.

So here for example is a factor again over a, b, c and d and I have only now

three factor, only three entries in that factor that gets some values, you don't

have to get the same value, but only they get some value and all of the one's that

are white are all have exactly the same score.

Generally zero. And.

This is actually surprisingly useful in practice because it's very useful to give

a higher score to combinations that occur in real data.

So for example if we were doing a spelling problem you want to give a

higher probability to whatever letter combinations occur in a dictionary.

And that's a finite number, it's bounded by the size of the dictionary.

And now you have all the combinatorially many letter combinations that don't occur

in the dictionary. You don't have to give them all a

separate weight, and now again it turns out to be a sparse factor.

The same thing actually occurs in an image analysis problem where, for

example, you can look at all the five by five image patches that actually occur in

natural images; that's actually a surprisingly small set of patches

relative to all possible 25 pixel combinations.

And so, once again, this is a very commonly used factor over.

[INAUDIBLE] that arises on practice. a different factor that has tractable

properties is a convexity factor. Here we have [INAUDIBLE].

Go on up to XK that are denoted by these blue dots over here.

And and, but now they're ordered. So this is X1, X2 and XK.

And the convexity constraint says you can assign these values, these variables one

or zero. But you have to assign them in a way

that. Only a convex set is assigned the value

one. So, for example, you can have this set be

one, and everything else be zero. Or you can have this set be one, and

everything else be zero. But you can't sort of have a scattered

set of ones that are interspersed with the zeros.

And again, this occurs in a surprising number of applications.

You can look at parts in an image segmentation problem and say that the

parts should be convex. You can think about word labeling in text

and say that the words labeled with a certain part of speech or a certain

a certain segment that corresponds to a part in the parse tree, for example,

needs to be contiguous in the sequence if you're trying to label a video, with a

bunch of sub activities, you can say that each sub activity needs to have a

coherent span with a beginning and an end for example.

So to summarize, there is many specialized models that admit a map,

solution, using tractable algorithms. And, a surprising number of those do not

actually have tractable algorithms for computing marginals.

And these specialized models are useful in many cases on their own like the

matching problem that I showed. but also, as we'll see in a min, as we'll

see as a component, in a larger model with other types of factors.

And the fact that this, as a sub-model, has a tractable solution, will turn out

to be very useful as well.