Discussion:
Roadmap for planned container features, and some future feature ideas.
Peter Dolding
2008-07-21 11:03:47 UTC
http://opensolaris.org/os/community/brandz/ I would like to know
whether something equivalent to this is on the roadmap. Being able
to run Solaris and AIX closed-source binaries in a container would be
useful.

Another useful feature would be some way to share a single process between
PID containers, like a container bridge. For containers used for
desktop applications, not having a single X11 server interfacing with
the video card is an issue.

Such container bridges would avoid having to go through network interfaces
and other means to share data between containers, as a user-space solution.

I know this reduces security, but when you need an application from
distribution X, you are running distribution Y, and the application is
OpenGL-heavy, you are stuck at the moment.

The final one is some form of per-container LSM processing. A lot of the
talk on the Linux security channel treats containers as lightweight
virtualisation, so the assumption is that one will never need to run an OS
inside with a different LSM profile from the host OS. If containers plan to
go after brandz-like functionality, it needs to be made clear that
per-container LSM processing will be required.

Peter Dolding
Eric W. Biederman
2008-07-21 12:13:27 UTC
Post by Peter Dolding
http://opensolaris.org/os/community/brandz/ I would like to know
whether something equivalent to this is on the roadmap. Being able
to run Solaris and AIX closed-source binaries in a container would be
useful.
There have been projects to do this at various times on Linux. Having
a namespace dedicated to a certain kind of application is no big deal.
Someone would need to care enough to implement and test it, though.
Post by Peter Dolding
Another useful feature would be some way to share a single process between
PID containers, like a container bridge. For containers used for
desktop applications, not having a single X11 server interfacing with
the video card is an issue.
X allows network connections, and I think unix domain sockets will work.
The latter I need to check on.
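To make the unix-socket case concrete, a minimal sketch, assuming the
host's /tmp/.X11-unix is visible inside the container (for example via
a bind mount); the display number is illustrative:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    /* Display :0 listens on /tmp/.X11-unix/X0; if that path is shared
     * into the container, an X client can reach the host's server. */
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/.X11-unix/X0", sizeof(addr.sun_path) - 1);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");  /* fails if the socket is not shared */
        return 1;
    }
    printf("reached the X server's unix socket\n");
    close(fd);
    return 0;
}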

The pid namespace is well defined, and no, a task will not be able
to change its pid namespace while running. That would be nasty.
Post by Peter Dolding
Such container bridges would avoid having to go through network interfaces
and other means to share data between containers, as a user-space solution.
There are lots of opportunities for user space solutions.
Post by Peter Dolding
I know this reduces security, but when you need an application from
distribution X, you are running distribution Y, and the application is
OpenGL-heavy, you are stuck at the moment.
The final one is some form of per-container LSM processing. A lot of the
talk on the Linux security channel treats containers as lightweight
virtualisation, so the assumption is that one will never need to run an OS
inside with a different LSM profile from the host OS. If containers plan to
go after brandz-like functionality, it needs to be made clear that
per-container LSM processing will be required.
We have had that discussion; mostly this appears to be a question of
maturity.

Eric
Peter Dolding
2008-07-21 13:21:35 UTC
On Mon, Jul 21, 2008 at 10:13 PM, Eric W. Biederman
Post by Eric W. Biederman
X allows network connections, and I think unix domain sockets will work.
The latter I need to check on.
It does, up to a point, until you see that local X11 uses shared memory
for speed. The hardest issue is getting GLX working.
Post by Eric W. Biederman
The pid namespace is well defined, and no, a task will not be able
to change its pid namespace while running. That would be nasty.
OK, so that is impossible, or extremely risky.

What about a form of proxy pid in the pid namespace, proxying
application chatter between one namespace and another? Applications
would be the bridge: if it is not possible to make it invisible,
applications could be made aware of it, so they can provide shared memory
and the like across pid namespaces, but only where they have an activated
proxy to do their bidding. This also allows applications to maintain their
own internal security between namespaces.

I.e. an application is one pid number in its source container and virtual
pid numbers in the other containers. Symbolic linking at the task level;
yes, a little warped. And yes, this will annoyingly mean a special set of
syscalls and a special set of capabilities and restrictions, like PID
containers starting up forbidding or allowing proxy pids.

If I am thinking right, that avoids having to change a task's pid:
instead you send and receive the messages you need in the other namespace
through a small proxy. Yes, I know that will cost some
performance.

Basically, I want to set up a neat, universal container way of handling
stuff like http://www.cs.toronto.edu/~andreslc/xen-gl/ without having
to go over the network, and hopefully in a way where limitations don't
have to exist, since messages are really only being sent through one X11
server to one driver system. The only real problem is sending the correct
messages to the correct place. There will most likely be other services
where a single entity is at times preferred. The worst outcome is if a
proxying .so is required.

Peter Dolding
Eric W. Biederman
2008-07-22 01:28:43 UTC
Post by Peter Dolding
On Mon, Jul 21, 2008 at 10:13 PM, Eric W. Biederman
Post by Eric W. Biederman
X allows network connections, and I think unix domain sockets will work.
The latter I need to check on.
It does, up to a point, until you see that local X11 uses shared memory
for speed. The hardest issue is getting GLX working.
That is easier in general. Don't unshare the sysvipc namespace,
or at least share the mount of /dev/shm for the files X cares about.
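A minimal sketch of that approach with the clone(2) flags of the day
(needs root): give the child its own pid and mount namespaces, but
deliberately leave CLONE_NEWIPC out, so the SysV shared memory that
MIT-SHM clients use to talk to X stays shared:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char stack[1024 * 1024];

static int child(void *arg)
{
    /* Pid 1 of the new pid namespace, but still in the parent's
     * sysvipc namespace, so shared memory segments used by local
     * X clients remain visible. */
    printf("in new pid namespace as pid %d\n", (int)getpid());
    return 0;
}

int main(void)
{
    /* CLONE_NEWIPC is deliberately omitted from these flags. */
    pid_t pid = clone(child, stack + sizeof(stack),
                      CLONE_NEWPID | CLONE_NEWNS | SIGCHLD, NULL);
    if (pid < 0) { perror("clone"); return 1; }
    waitpid(pid, NULL, 0);
    return 0;
}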
Post by Peter Dolding
Post by Eric W. Biederman
The pid namespace is well defined, and no, a task will not be able
to change its pid namespace while running. That would be nasty.
OK, so that is impossible, or extremely risky.
What about a form of proxy pid in the pid namespace, proxying
application chatter between one namespace and another? Applications
would be the bridge: if it is not possible to make it invisible,
applications could be made aware of it, so they can provide shared memory
and the like across pid namespaces, but only where they have an activated
proxy to do their bidding. This also allows applications to maintain their
own internal security between namespaces.
I.e. an application is one pid number in its source container and virtual
pid numbers in the other containers. Symbolic linking at the task level;
yes, a little warped. And yes, this will annoyingly mean a special set of
syscalls and a special set of capabilities and restrictions, like PID
containers starting up forbidding or allowing proxy pids.
If I am thinking right, that avoids having to change a task's pid:
instead you send and receive the messages you need in the other namespace
through a small proxy. Yes, I know that will cost some
performance.
Proxy pids don't actually do anything for you unless you want to send
signals, because all of the namespaces are distinct. So even at
best you can see the X server, but it still can't use your
network sockets or IPC shm.

Better is working out the details of how to manipulate multiple
sysvipc and network namespaces from a single application. Mostly
that is supported now by the objects; there is just no easy way
of dealing with it.
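There was no convenient interface for this at the time of the thread;
purely for illustration, the interface that later appeared is setns(2)
(Linux 3.0), which lets a single process hop into another container's
namespace via /proc:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Attach the calling thread to another container's network namespace,
 * referenced by a /proc path such as /proc/<pid>/ns/net. */
int enter_netns(const char *nspath)
{
    int fd = open(nspath, O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }
    if (setns(fd, CLONE_NEWNET) < 0) {
        perror("setns");
        close(fd);
        return -1;
    }
    close(fd);
    /* Sockets created from here on belong to the other namespace. */
    return 0;
}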
Post by Peter Dolding
Basically, I want to set up a neat, universal container way of handling
stuff like http://www.cs.toronto.edu/~andreslc/xen-gl/ without having
to go over the network, and hopefully in a way where limitations don't
have to exist, since messages are really only being sent through one X11
server to one driver system. The only real problem is sending the correct
messages to the correct place. There will most likely be other services
where a single entity is at times preferred. The worst outcome is if a
proxying .so is required.
Yes, I agree that is essentially desirable. Given that I think
high-end video cards actually have multiple hardware contexts that
can be mapped into different user-space processes, there may be other
ways of handling this.

Ideally we can find a high-performance solution for X that also gives
us good isolation and migration properties. Certainly something to talk
about at the conference tomorrow.

Eric
Oren Laadan
2008-07-22 14:05:27 UTC
Post by Eric W. Biederman
Yes, I agree that is essentially desirable. Given that I think
high-end video cards actually have multiple hardware contexts that
can be mapped into different user-space processes, there may be other
ways of handling this.
Ideally we can find a high-performance solution for X that also gives
us good isolation and migration properties. Certainly something to talk
about at the conference tomorrow.
In particular, if you share private resources of one container
with another container, then you won't be able to use
checkpoint/restart on either container (unless you make special
provisions in the code).

I agree with Eric that the way to handle this is via virtualization
as opposed to direct sharing. The same goes for other hardware, e.g.
in the context of a user desktop - /dev/rtc, sound, and so on. My
experience is that a proxy/virtualized device is what we probably
want.

Oren.
Peter Dolding
2008-07-23 00:56:46 UTC
Post by Oren Laadan
In particular, if you share private resources of one container
with another container, then you won't be able to use
checkpoint/restart on either container (unless you make special
provisions in the code).
I agree with Eric that the way to handle this is via virtualization
as opposed to direct sharing. The same goes for other hardware, e.g.
in the context of a user desktop - /dev/rtc, sound, and so on. My
experience is that a proxy/virtualized device is what we probably
want.
Oren.
Giving up the ability to checkpoint containers cleanly and independently
of each other when using X11 might be a requirement. The reason is GPU
processing: if you want to provide that, a lot of GPUs don't have a good
segmented freeze; it's either park the full GPU or risk issues on
startup. Features need to be added to GPUs so that individual OpenGL
contexts can be suspended. So any application using the GPU will most
likely have to be lost in a checkpoint/restore that is done independently
of the other X11 applications on the desktop.
Even suspending the GPU as a block, there are still issues with some cards.

Sorry, Oren: from using http://www.virtualgl.org I know suspending GPUs
is trouble.

http://www.cs.toronto.edu/~andreslc/xen-gl/ blocks out all use of the
GPU for advanced processing, effectively crippling the card. Virtualized
basically is not going to cut it; you need access to the GPU for
particular software to work.

This is more about containers being used by desktop users to run many
distributions at once.

Of course, there is nothing stopping the checkpoint process from informing
the user that it cannot go past this point until the following
applications are closed, i.e. the ones using GPU shader processing and
the like. We just have to wait for video card makers to provide us with
something equivalent to Intel's and AMD's CPU virtualisation instructions,
so that independent OpenGL contexts can be suspended.

Multiple hardware contexts are many independent GPUs stuck on one card,
just like sticking more video cards in a computer. Yes, they can be
suspended independently; yes, how they are allocated should be
controllable; but they are not on every card out there. And if you want
migration, sorry, really bad news here: a GPU's suspended state has to be
loaded back onto exactly the same type of GPU or you are stuffed. Two
different models of card will not work. So this does not help you at all
with migration, or even worse, video card death. Most people forget
that a suspend using compiz or anything else in the GPU cannot be restored
if you have changed video cards to a different GPU. A brand-new card does
not help you here.

Full X11 with fully functional OpenGL will mean giving some things up.
A means of keeping every application running through a migration or
checkpoint is impossible. Applications that are container/suspend aware
could have some form of internal rebuild of their OpenGL context after
restore, from a point where they can restart their processing loop, but
they would have to redo all their shader code and other in-GPU processing
code in case of a change of GPU type, and possibly even their engine's
internal paths. This alteration would make checkpointing and migration
dependable again, but only for aware applications.

X11 2D can suspend and restore without major issue, as
http://partiwm.org/wiki/xpra shows. 3D is a bugger.

There is basically no magic trick to get around this problem.
Containers alone cannot solve it. The rare case with loss has to be
accepted to make it work. And it working will be like Xen: when it
started, it got CPU makers looking at making things better.

Restart should be a non-issue. Clearing the OpenGL context
displayed on the X11 server is already done when an application splats;
an outright reset would be equivalent. When the application restarts, it
will create its OpenGL context anew, so there is no 3D issue.

Video cards are different from most other hardware you are dealing with.
They are a second processing core that you don't have full control
over, and they differ card to card to the point of being 100 percent
incompatible with each other.


Peter Dolding
Oren Laadan
2008-07-24 18:32:43 UTC
Post by Peter Dolding
Giving up the ability to checkpoint containers cleanly and independently
of each other when using X11 might be a requirement. [...]
Multiple hardware contexts are many independent GPUs stuck on one card,
just like sticking more video cards in a computer. Yes, they can be
suspended independently; yes, how they are allocated should be
controllable; but they are not on every card out there. And if you want
migration, sorry, really bad news here: a GPU's suspended state has to be
loaded back onto exactly the same type of GPU or you are stuffed. Two
different models of card will not work. So this does not help you at all
with migration, or even worse, video card death. Most people forget
that a suspend using compiz or anything else in the GPU cannot be restored
if you have changed video cards to a different GPU. A brand-new card does
not help you here.
[...]
Video cards are different from most other hardware you are dealing with.
They are a second processing core that you don't have full control
over, and they differ card to card to the point of being 100 percent
incompatible with each other.
If you want to migrate containers with user desktops, you really have
to be able to pull the state off the display hardware on the source
machine and reinstate it on the display hardware of the target
machine. This is practically impossible given current hardware and the
variance between vendors, and that probably won't change. Instead, you
_must_ have a way to virtualize the display, for instance by using VNC.
VNC is OK for regular work, but is inefficient in many respects. Projects
like THINC (http://www.ncl.cs.columbia.edu/research/thinc) improve on it
by making the remote display efficient to the point that you can actually
watch movies over a remote display. As far as I know, the 3D case is not
yet solved efficiently.

Current solutions for running user desktop sessions in containers rely
on remote display to virtualize the display, such that rendering is
either done in software on the server or in hardware on the (stateless)
client side. In my opinion the same should apply for 3D graphics within
such environments, which probably means doing the actual rendering at
the client side.

Oren.
Peter Dolding
2008-07-25 03:32:54 UTC
Post by Oren Laadan
If you want to migrate containers with user desktops, you really have
to be able to pull the state off the display hardware on the source
machine and reinstate it on the display hardware of the target
machine. This is practically impossible given current hardware and the
variance between vendors, and that probably won't change. Instead, you
_must_ have a way to virtualize the display, for instance by using VNC.
VNC is OK for regular work, but is inefficient in many respects. Projects
like THINC (http://www.ncl.cs.columbia.edu/research/thinc) improve on it
by making the remote display efficient to the point that you can actually
watch movies over a remote display. As far as I know, the 3D case is not
yet solved efficiently.
Current solutions for running user desktop sessions in containers rely
on remote display to virtualize the display, such that rendering is
either done in software on the server or in hardware on the (stateless)
client side. In my opinion the same should apply for 3D graphics within
such environments, which probably means doing the actual rendering at
the client side.
The simple problem: that is not possible and most likely never will be.
The way to get virtual OpenGL over VNC is to use a program called
VirtualGL (http://virtualgl.org), and that is not stateless, and it is
GPU-dependent.

You keep forgetting that high-end stuff like GLSL needs to be
processed in the video card's GPU; emulating it would need really big
CPUs. 2D rendering is simple to do statelessly.

Your opinion is simply not workable; you are not thinking about the
problem correctly.

Let's say you want to migrate running applications between x86 and PPC
using containers: how are you going to do it? That is really the
level of difference you have between video card GPUs; they are that
far apart. Programs have to generate their GLSL code to suit the
video card they are talking to or it will not work correctly, which is
the reason some games say on the box NVIDIA-only or ATI-only.

Now, before you say emulate, I would like to point something nasty out.
The state dump of a GPU is basically undocumented; it is a black box, so
emulation is not an option. And even if you could, you are talking about
emulating something that would need the power of a 16-core 4 GHz Intel
processor per GPU, and some video cards have up to four. How effective
GPUs are at doing their job is highly underestimated.

Yes, some OpenGL programs can be handled statelessly, up until
they start using shader languages, physics, and other things in the GPU.
Past that point, stateless and GPU-independent handling is gone. Lots and
lots of programs need the GPU-dependent stuff.

VirtualGL does rendering server-side because of the massive amounts of
data that can travel between the CPU and GPU. It is really like
saying we are not going to have a maths processor in the server, and
every time you want to do some maths you have to go over the network to
do it.

GPUs are not just drawing to the screen; they do all kinds of things
these days. They have to be local to the program using them: there
is not enough network bandwidth, and they will not run right otherwise.

Peter Dolding
Eric W. Biederman
2008-07-26 07:05:19 UTC
Post by Peter Dolding
The simple problem: that is not possible and most likely never will be.
The way to get virtual OpenGL over VNC is to use a program called
VirtualGL (http://virtualgl.org), and that is not stateless, and it is
GPU-dependent.
[...]
GPUs are not just drawing to the screen; they do all kinds of things
these days. They have to be local to the program using them: there
is not enough network bandwidth, and they will not run right otherwise.
I need to research this some more, but migration is essentially the
same problem as suspend and hibernation (admittedly, different kinds of
video make this trickier). Migration and hibernation we can handle
today, with hardware going away and coming back. It sounds to me like
the most general approach is a lightweight proxy X server that can
forward things on the slow path and allow fast-path accesses to
go fast.

Eric
Peter Dolding
2008-07-26 10:06:05 UTC
On Sat, Jul 26, 2008 at 5:05 PM, Eric W. Biederman
Post by Eric W. Biederman
I need to research this some more, but migration is essentially the
same problem as suspend and hibernation (admittedly, different kinds of
video make this trickier). Migration and hibernation we can handle
today, with hardware going away and coming back. It sounds to me like
the most general approach is a lightweight proxy X server that can
forward things on the slow path and allow fast-path accesses to
go fast.
The issue is that a lightweight proxy fails as soon as you start dealing
with the GPU processing stuff.

This has already been fairly heavily researched. Complete system
suspend works because you take the complete GPU offline and restore
it back to where it was, on the same GPU.

Migration is worse than you can imagine. Even the same model of GPU
loaded with a different GPU's saved state can fail, if the maker has
altered paths due to damage in one chip. I see no trick around this.
Even GPUs on the same card can fail if you try to restore the wrong
state back into them.

Using a lightweight proxy, you will be able to tag applications using
advanced GPU instructions that will not migrate or suspend happily.

http://partiwm.org/wiki/xpra is one of the projects you will want to
work with when building a lightweight proxy. It allows X11
applications to be disconnected from one X11 server and connected to
another.

http://www.cs.toronto.edu/~andreslc/xen-gl/ also got a fair way along,
forwarding OpenGL.

It's more that you will have to sort applications; they break down into
four classes:

1. 2D, and maybe some OpenGL, that is suspendable because the interface
can simply be regenerated on a new X11 server using nothing GPU-targeted.

2. Heavy GPU use, with a detectable resend of everything to the GPU and
corrections for the change of video card. This would cover some games
that restart the game engine with diagnostics at a change of level. It
means setting fixed suspend points, with the application suspendable only
at those points; for applications that offload some tasks to the GPU,
this is like having to wait for a critical section of the kernel to
complete.

3. GPU use with no clean suspend points: has to be lost in the suspend or
transfer process.

4. A cooperative setup built into a Xen-like system for GPU access, where
the application can be told to suspend and restore itself.

Most likely all four types will be needed to cover everything, or as
close to everything as you can get.

This is exactly the issue you are dealing with, in pseudocode:

start of program:
    detect the GPU
    generate the GPU code the program needs
    send the GPU code to the GPU

later on in the program:
    call the GPU code uploaded earlier
    do some CPU work
    get the results from the earlier GPU call
    do some CPU work
    get another batch of results from the same GPU function started before
    ... repeating until the program completes
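For concreteness, here is the same lifecycle as real host code, using
OpenCL (which postdates this thread by a few months; Brook- or GLSL-based
GPGPU has the same shape). The kernel source, names, and sizes are
illustrative; the point is that the compiled kernel and the working buffer
live on the card as opaque state no checkpointer can read back:

#include <CL/cl.h>
#include <stdio.h>

static const char *src =
    "__kernel void step(__global float *v) { v[get_global_id(0)] += 1.0f; }";

int main(void)
{
    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    /* "send the GPU code to the GPU": compiled for this vendor's GPU. */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "step", NULL);

    float data[256] = {0};
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                sizeof(data), data, NULL);
    clSetKernelArg(k, 0, sizeof(buf), &buf);

    /* "call the uploaded GPU code ... get the results ... repeat":
     * between these calls the authoritative state is on the card. */
    size_t n = 256;
    for (int i = 0; i < 3; i++) {
        clEnqueueNDRangeKernel(q, k, 1, NULL, &n, NULL, 0, NULL, NULL);
        clEnqueueReadBuffer(q, buf, CL_TRUE, 0, sizeof(data), data,
                            0, NULL, NULL);
        printf("pass %d: v[0] = %f\n", i, data[0]);
    }
    return 0;
}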

The reason: someone might run something like a random number provider in
the GPU. Stopping and restoring anywhere in that is going to need
knowledge of what is going on inside the GPU that you cannot get.

Basically, a complete parallel thread can be looping away in the GPU.
Without inspecting the GPU code, or seeing it directly terminated, you
don't know whether it is going to be accessed again; and if it should be
running and it is not, the program splats. It is even worse if it holds
some form of cumulative state, like a GPU simulating gravity or rebound
effects on a character. Even throwing the starting program back into the
GPU is not going to work, because it will not have the correct state.

The other issue with inspecting the GPU code: lots of programs generate
output for ATI or NVIDIA GPUs and then specialise that code down to the
model of GPU, from their own code base.

Worst of all, the application may not be graphical at all; it might be
something using http://graphics.stanford.edu/projects/brookgpu/ to
offload processing to the GPU.

Suspend is basically impossible for an application using the GPU without
suspending everything else in that GPU, until such a feature is added
to GPUs. Even worse, migration is not dependable between GPUs on the
same card, let alone between different cards. So other paths will have
to be taken.

The GPU is the problem; most of the other rendering stuff is simple to
solve. GPUs are a stack of rogue threads.

There is no magical way around this problem; we all wish there were.
Every nice virtual solution hits a brick wall. A mixed solution is
kind of needed. That is also the reason I did not care if desktop use
completely broke the means to migrate and suspend; it is a lot simpler
that way.

Of course you might dream up some way I have not even considered.

Peter Dolding

PS: a GPU language of our own would only work for newly coded
applications, and it would also be slower, as some programs build their
own GPU code locally in advance and upload it on demand.
Eric W. Biederman
2008-07-26 13:56:45 UTC
Post by Peter Dolding
The issue is that a lightweight proxy fails as soon as you start dealing
with the GPU processing stuff.
This has already been fairly heavily researched. Complete system
suspend works because you take the complete GPU offline and restore
it back to where it was, on the same GPU.
Darn. 2D X (the last time I looked at it) has the property that
applications are required to be able to repaint all of their windows at
any point. It sounds like this is not the case for 3D X applications,
and that is a real shame.
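The 2D contract Eric describes is the Expose-event protocol: the server
may discard a window's contents at any time, and the client must redraw
from its own state. A minimal sketch of a client honouring that contract:

#include <X11/Xlib.h>
#include <stdio.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) { fprintf(stderr, "cannot open display\n"); return 1; }

    int scr = DefaultScreen(dpy);
    Window win = XCreateSimpleWindow(dpy, RootWindow(dpy, scr), 0, 0,
                                     200, 100, 0, BlackPixel(dpy, scr),
                                     WhitePixel(dpy, scr));
    XSelectInput(dpy, win, ExposureMask);
    XMapWindow(dpy, win);

    for (;;) {
        XEvent ev;
        XNextEvent(dpy, &ev);
        /* The server says our contents are gone; repaint entirely from
         * client-side state. Nothing is recovered from the old server,
         * which is what makes 2D migratable. */
        if (ev.type == Expose && ev.xexpose.count == 0)
            XDrawString(dpy, win, DefaultGC(dpy, scr),
                        20, 50, "redrawn", 7);
    }
}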

I also suspect that even if I use the repaint trick, and do it in such a
way that I am allowed to switch the GPU, buggy applications will fail.
Post by Peter Dolding
Migration is worse than you can imagine. Even the same model of GPU
loaded with a different GPU's saved state can fail, if the maker has
altered paths due to damage in one chip. I see no trick around this.
Even GPUs on the same card can fail if you try to restore the wrong
state back into them.
Not really. I know that direct hardware access by application programs
is a problem. My practical question is how robust programs are to hardware
hotplug.
Post by Peter Dolding
Using a lightweight proxy, you will be able to tag applications using
advanced GPU instructions that will not migrate or suspend happily.
That sounds very practical.
Post by Peter Dolding
http://partiwm.org/wiki/xpra is one of the projects you will want to
work with when building a lightweight proxy. It allows X11
applications to be disconnected from one X11 server and connected to
another.
[...]
It's more that you will have to sort applications; they break down into
four classes. [...] GPU use with no clean suspend points has to be lost
in the suspend or transfer process.
Yep.
Post by Peter Dolding
The GPU is the problem; most of the other rendering stuff is simple to
solve. GPUs are a stack of rogue threads.
There is no magical way around this problem; we all wish there were.
Every nice virtual solution hits a brick wall. A mixed solution is
kind of needed. That is also the reason I did not care if desktop use
completely broke the means to migrate and suspend; it is a lot simpler
that way.
Totally. And if we have the capability today to make it work without
the possibility of suspend/resume, I am happy with that.
Post by Peter Dolding
Of course you might dream up some way I have not even considered.
Certainly worth thinking about.

For now I am concerned with getting the basics working solidly, so I
don't plan on working on any of this any time soon. Other contributions
are welcome.
Post by Peter Dolding
PS: a GPU language of our own would only work for newly coded
applications, and it would also be slower, as some programs build their
own GPU code locally in advance and upload it on demand.
Nah. I'm happy so long as we have a way to say: "Hey! You silly application
that was talking directly to the hardware. You have to deal with new hardware
now. Cope."

Eric
Peter Dolding
2008-07-27 05:17:39 UTC
On Sat, Jul 26, 2008 at 11:56 PM, Eric W. Biederman
Post by Eric W. Biederman
Darn. 2D X (the last time I looked at it) has the property that
applications are required to be able to repaint all of their windows at
any point. It sounds like this is not the case for 3D X applications,
and that is a real shame.
I also suspect that even if I use the repaint trick, and do it in such a
way that I am allowed to switch the GPU, buggy applications will fail.
All OpenGL and GPU-using programs would fail on that switch. The reason:
for non-GPU OpenGL, the 3D data was sent to the video card for processing
into a texture and then displayed; if you have those calls captured and
recorded, you can replay them. With non-GPU OpenGL it is possible to
keep an active list of data to put back, as long as the program doesn't
use any vendor-dependent extensions. The same model of card would work
here even with extensions.

A 3D program moves far more data than a 2D one and could not afford to
be resending it all the time.
Post by Eric W. Biederman
Post by Peter Dolding
Migration is worse than you can imagine. Even the same model of GPU
loaded with a different GPU's saved state can fail, if the maker has
altered paths due to damage in one chip. I see no trick around this.
Even GPUs on the same card can fail if you try to restore the wrong
state back into them.
Not really. I know that direct hardware access by application programs
is a problem. My practical question is how robust programs are to hardware
hotplug.
Most OpenGL programs are not robust against hotplug of the video card,
so the answer is: they crash if their active card is removed, and they
don't detect and use a newly added OpenGL-capable card until they get to
a detection point.
Post by Eric W. Biederman
Nah. I'm happy so long as we have a way to say: "Hey! You silly application
that was talking directly to the hardware. You have to deal with new hardware
now. Cope."
Eric
I think we want two messages: "Hey, program, I am going to nick your
hardware; prepare yourself before suspend", and then "You possibly have
new hardware; cope". This could become a general mechanism for all
programs that use hardware directly.
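A hypothetical sketch of that two-message protocol; the signal numbers
and their semantics are invented here for illustration and are not an
existing kernel interface:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Invented convention: two realtime signals carry the two messages. */
#define SIG_HW_GOING   (SIGRTMIN + 0)  /* "prepare, your hardware is leaving" */
#define SIG_HW_CHANGED (SIGRTMIN + 1)  /* "you may have new hardware; cope" */

static volatile sig_atomic_t hw_event;

static void on_going(int sig)   { hw_event = 1; }
static void on_changed(int sig) { hw_event = 2; }

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_going;
    sigaction(SIG_HW_GOING, &sa, NULL);
    sa.sa_handler = on_changed;
    sigaction(SIG_HW_CHANGED, &sa, NULL);

    for (;;) {
        pause();                /* the application's main loop */
        if (hw_event == 1) {
            /* Park GPU work and save application-level state at a
             * point the processing loop can be restarted from. */
            puts("quiescing before suspend");
        } else if (hw_event == 2) {
            /* Re-detect the device; rebuild contexts and shaders. */
            puts("rebuilding GPU contexts");
        }
        hw_event = 0;
    }
}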

Peter Dolding
